Handbook of Unconventional Computing: Theory / Implementations. 2 volumes (WSPC Book Series in Unconventional Computing). ISBN 9811235031, 9789811235030

Did you know that computation can be implemented with cytoskeleton networks, chemical reactions, liquid marbles, plants, …


English · 1208 pages · 2021


Table of contents :
Volume 1: Theory
Contents
Preface
Chapter 1. Mapping the Territory of Computation Including Embodied Computation
1.1. Unconventional Computation
1.1.1. Implications
1.2. Computation in General
1.2.1. Topology of information representation
1.2.2. Topology of information processing
1.2.3. Programmability
1.2.4. Universality
1.3. Embodied Computation
1.3.1. Definition
1.3.2. Physics for computational purposes
1.3.2.1. Transduction
1.3.2.2. Analog computation
1.3.2.3. Quantum computation
1.3.2.4. Field computation
1.3.3. Computation for physical purposes
1.4. Programmable Matter
1.5. Artificial Morphogenesis
1.6. Conclusions
References
Chapter 2. Reversible Logic Element with Memory as an Alternative Logical Device
2.1. Introduction
2.2. Reversible Logic Element with Memory (RLEM)
2.2.1. Definition of a reversible logic element with memory
2.2.2. Rotary element (RE), a typical RLEM
2.2.3. Constructing reversible machines by REs
2.3. All Non-degenerate 2-State RLEMs Except Four are Universal
2.4. Realizing RLEMs in Reversible Environments
2.4.1. RLEMs in the billiard ball model
2.4.2. RLEM in a simple reversible cellular automaton
2.5. Concluding Remarks
References
Chapter 3. Space Bounded Scatter Machines
3.1. Introduction
3.2. First Concepts and Promised Results
3.3. The Experiment, the Protocols and the Machine
3.4. The Standard Scatter Machine
3.4.1. Probabilistic trees
3.4.2. Sparse oracles
3.4.3. Coding and decoding the vertex position
3.4.4. Lower bounds
3.4.5. Upper bounds
3.5. The Generalized Scatter Machine
3.5.1. Lower bounds
3.5.2. Upper bounds
3.6. Conclusion
Acknowledgments
References
Chapter 4. Exclusively Quantum: Computations Only Possible on a Quantum Computer
4.1. Introduction
4.2. Background: Parallelism and Quantum Theory
4.2.1. On the importance of parallelism
4.2.2. Quantum computation and quantum information
4.2.2.1. The qubit
4.2.2.2. Measurements
4.2.2.3. Putting qubits together
4.2.2.4. Entanglement
4.3. Some History
4.3.1. Quantum versus classical computation
4.3.2. A review of previous results
4.3.2.1. True randomness
4.3.2.2. Entangled states
4.3.2.3. Quantum speed-up
4.3.2.4. Quantum simulations
4.3.2.5. QTM versus DTM and PTM
4.3.2.6. Quantum versus classical complexity
4.3.2.7. Super-Turing computations
4.4. Unconventional Computations and Non-universality
4.5. Evolving Computations
4.5.1. Evolving computational complexity
4.5.1.1. Rank-varying computational complexity
4.5.1.2. Time-varying computational complexity
4.5.2. Evolving computational variables
4.5.2.1. Time-varying variables
4.5.2.2. Interacting variables
4.5.2.3. Computations obeying a global constraint
4.6. Quantum Fourier Transform
4.6.1. Rank-varying complexity
4.6.2. Semi-classical solution
4.6.3. Parallel approach
4.6.4. Quantum decoherence
4.6.5. Time-varying variables
4.7. Quantum Error-correction
4.7.1. Quantum codes
4.7.2. Time-varying complexity
4.7.3. Error correction via symmetrization
4.8. Entanglement Revisited
4.8.1. Interacting variables
4.8.2. Quantum distinguishability
4.8.2.1. Generalization
4.8.3. Another approach to distinguishability
4.8.4. Some consequences
4.8.4.1. Conveying quantum information through a classical channel
4.8.4.2. Protecting quantum information from classical attacks
4.8.5. Transformations obeying a global constraint
4.9. Conclusion
References
Chapter 5. Estimations of Integrated Information Based on Algorithmic Complexity and Dynamic Querying
5.1. Introduction
5.2. Basic Concepts of Integrated Information Theory
5.2.1. Calculus of ϕ
5.3. Methods
5.3.1. Programmability test and meta-test
5.3.2. Causal perturbation analysis
5.3.3. Causal influence and sublog program-size divergence
5.3.4. A simplicity versus complexity test
5.4. Numerical Results
5.4.1. Compression sensitivity as informative of integration
5.4.2. Finding simple rules in complex behavior
5.4.3. Simple rules and the black pattern of distribution of information
5.4.3.1. Automatic meta-perturbation test
5.4.3.2. Shrinking after dividing to rule
5.4.4. Limitations
5.5. Conclusions and Future Directions
References
Appendix A
A.1. How a meta-perturbation test works
Appendix B
Chapter 6. Robot Narratives
6.1. Introduction
6.2. Robots Explore an Alien Planet
6.2.1. Prologue
6.2.2. Greek Olympus, narrative robots
6.2.3. Roman Olympus, declarative robots
6.2.4. Epilogue
6.3. Robot Imaginations
6.3.1. Dennett’s Tower
6.3.2. Robots in the tower
6.3.3. An architecture for an ethical Popperian robot
6.3.4. From Popperian to Gregorian robots
6.3.5. Narrative logic
6.4. Gregorian Chat
6.4.1. The narrative hypothesis in more detail
6.4.2. The Gregorian chat system
6.4.2.1. Robot architecture
6.4.2.2. Hybrid physical/virtual environment architecture
6.4.3. Testing the narrative hypothesis in a robot ecology
6.4.4. Extensions of the approach
6.5. Discussion and Conclusions
6.5.1. Communication and cognition
6.5.2. Social robots
6.5.3. Narrative logic and its interface with world modeling in artificial intelligence
6.5.4. Beyond a “repository of actions”: The particular and the general in narrative
6.5.5. Story generator and story parser
6.5.6. Preparing for the future
Acknowledgments
References
Chapter 7. Evolving Boolean Regulatory Networks with Variable Gene Expression Times
7.1. Introduction
7.2. The RBNK Model
7.3. The RBNK Model with Variable Gene Expression Times
7.3.1. Gene expression
7.3.2. Experimentation
7.3.3. Asynchronous experimentation
7.4. Variable Sized GRN with Variable Gene Expression Times
7.4.1. Emergent complexity
7.4.2. Experimentation
7.5. Conclusions
References
Chapter 8. Membrane Computing Concepts, Theoretical Developments and Applications
8.1. Introduction
8.2. Types of P Systems
8.2.1. Preliminaries
8.2.2. Transition P systems
8.2.3. P systems with active membranes
8.2.4. Communication P systems
8.2.5. Tissue-like P systems with symport/antiport rules
8.2.6. Spiking neural P systems
8.2.7. Enzymatic numerical P systems
8.3. Computing Power and Computational Efficiency
8.3.1. Introduction
8.3.2. Computing power
8.3.2.1. Rewriting membrane systems
8.3.3. Computational efficiency
8.3.3.1. Recognizer membrane systems
8.3.3.2. Polynomial time complexity classes
8.3.3.3. Limits on efficient computations
8.3.3.4. Solving computationally hard problems
8.4. Applications of Membrane Computing
8.4.1. Modeling ecosystems with population dynamics P systems
8.4.2. Path planning and control of mobile robots
8.4.3. Fault diagnosis with spiking neural P systems
8.4.4. Other applications
8.4.5. Concluding remarks
8.5. Implementation of P Systems
8.5.1. Software implementations
8.5.2. GPU-based hardware implementations
8.5.2.1. GPU computing
8.5.2.2. GPU simulators of P systems
8.5.3. FPGA-based hardware implementations
8.5.4. Discussion
8.6. Concluding Remarks
References
Chapter 9. Computing with Modest Resources: How to Have Your Cake and Eat it Too
9.1. Introduction
9.2. Why Bigger is Not Necessarily Smarter
9.3. Why Smaller Might be Smarter
9.4. What is External Drive and How Does It Help
9.5. A Thought Experiment: Maxwell’s Daemon Rebooted
9.6. Example: Memristor Networks
9.6.1. Memristor model
9.7. Conclusions
References
Chapter 10. Physical Randomness Can Help in Computations
10.1. Introduction
10.2. Randomness is Important
10.3. How Can We Describe Randomness in Precise Terms?
10.4. Back to Physics
10.5. How Does This Affect Computations
Acknowledgments
References
Chapter 11. Probabilistic Logic Gate in Asynchronous Game of Life with Critical Property
11.1. Introduction
11.2. Asynchronous Game of Life and its Logic Gates
11.2.1. Phase transition and criticality
11.2.2. Computation by asynchronous GL
11.2.3. Logic gate in asynchronous GL
11.3. Discussion
11.4. Conclusion
Acknowledgements
References
Chapter 12. A Mystery of Human Biological Development — Can It Be Used to Speed up Computations?
12.1. Formulation of the Problem
12.2. Exponential Speedup Phenomenon: A Brief History and a Brief Description
Acknowledgments
References
Chapter 13. Symmetric Automata and Computations
13.1. Introduction
13.2. Structure of Abstract Automata and Instruction Machines
13.3. Structure of Symmetric Automata and Machines
13.4. Functional Characteristics of Operationally Symmetric Turing Machines
13.5. Conclusion
References
Chapter 14. Computation By Biological Means
14.1. Introduction
14.1.1. Computation
14.1.2. Use of biology for computation
14.1.3. What can biocomputers do?
14.2. DNA Computing
14.2.1. Origins of DNA computing
14.2.2. Computing with DNA
14.2.3. Adleman’s experiment
14.2.4. Sticker systems
14.2.5. Recent progress on computational power of DNA
14.2.6. Limitations of DNA
14.3. Computation with Slime Mould
14.3.1. Introduction to slime moulds
14.3.2. Slime moulds for computational tasks
14.3.3. Amoeboid organisms as a computer
14.3.4. Slime moulds as a form of computation
14.3.5. Select applications of slime moulds
14.3.6. Challenges of computing with slime moulds
14.4. Computation with Motile Biological Agents
14.4.1. Molecular motors
14.4.2. Use of biological motion in confined structures
14.4.3. Issues with mobile biologicals
14.5. Computation with Synthetic Biology
14.5.1. Chemotaxis
14.5.2. Saccharum saccharomyces and genetic toggles
14.5.3. Cellular computation
14.5.4. Multicellular machines
14.6. Differences in Biological Computing Paradigms
14.7. Discussion
14.7.1. Critical research gaps in biological computing
14.7.2. The case for biosupremacy
References
Chapter 15. Swarm and Stochastic Computing for Global Optimization
15.1. Introduction
15.2. Optimization
15.3. Stochastic Enhancements
15.4. Evolutionary Computation
15.5. Nature-Inspired Computing
15.5.1. Non-SI-based approaches
15.5.2. SI-based approaches
15.6. Discussions
References
Chapter 16. Vector Computation
16.1. Epistemology versus Ontology in the Quantum Computation Context
16.2. Types of Quantum Oracles for Randomness: Pure States in a Superposition Versus Mixed States
16.3. Questionable Parallelism by Views on a Vector
16.4. Computation by Projective Measurements of Partitioned State Space
16.5. Entanglement as Relational Parallelism Across Multi-Partite States
16.6. On Partial Views of Vectors
16.7. Summary
Acknowledgments
References
Chapter 17. Unsupervised Learning Approach Using Reinforcement Techniques on Bio-inspired Topologies
17.1. Introduction
17.1.1. Molecular networks
17.1.2. Cellular automata
17.1.3. Conway’s Game-of-Life
17.1.4. Neuromorphic computing systems
17.2. Molecular-based Topology
17.3. Artificial Neural Networks
17.4. Neuron Model
17.4.1. Simple Izhikevich model
17.5. Excitation Reinforcement
17.5.1. Majority-rule
17.5.2. Game-of-Life rule
17.6. Unsupervised Learning
17.6.1. Majority-rule learning
17.6.2. Game-of-Life learning
17.6.3. Hebbian learning
17.7. Training and Classification
17.8. Results
17.9. Discussion
Acknowledgement
References
Chapter 18. Intelligent Gamesourcing — Artificial Intelligence in Problem Solution by Game Playing
18.1. Introduction
18.1.1. Gamesourcing
18.1.2. History
18.2. Motivation
18.2.1. Leisure
18.2.2. Game playing
18.2.3. Paradigm
18.2.4. Structure
18.2.5. Fun
18.2.6. Credibility of outputs
18.2.7. Evaluation of success
18.2.8. Current status
18.2.8.1. Astro Drone
18.2.8.2. EteRNA
18.2.8.3. Foldit
18.2.8.4. Play to cure: Genes in space
18.2.8.5. Sea Hero Quest
18.3. Application — People versus Differential Evolution in Search of the Shortest Path
Acknowledgment
References
Index
Volume 2: Implementations
Contents
Preface
Chapter 1. From Oscillatory Reactions to Robotics: A Serendipitous Journey Through Chemistry, Physics and Computation
1.1. Introduction
1.2. Systems Dynamics as a Computational Platform
1.2.1. Wet oscillating systems
1.2.2. Electrochemical oscillators
1.3. Computation and Control in Dynamic Systems
1.3.1. Computation in memristive devices and systems
1.3.1.1. Logic design with memristors/memristive devices
1.3.1.2. Matrix vector multiplication
1.3.1.3. Hardware artificial neural networks
1.3.2. Principles of control in dynamic systems — PID case
1.3.3. Reservoir computing
1.3.4. Reservoir computing and control systems
1.4. Controllers Beyond PID: Fuzzy and Neuromorphic
1.4.1. Fuzzy logic
1.4.2. Processing fuzzy logic by using molecules
1.4.3. Implementation of fuzzy logic systems in solid-state devices
1.4.4. Neuromorphic devices
1.5. Alternative Computing in Autonomous Robotics
1.5.1. Amoeba-based solution search system and electronic amoeba
1.5.2. Amoeba-inspired autonomously walking robot
1.5.3. Physicality for the identification of ground condition
1.5.4. Integration of reinforcement learning for efficient walking
1.6. Alternative Computing and Soft Robotics
1.7. Concluding Remarks
Acknowledgments
References
Chapter 2. Computing by Chemistry: The Native Chemical Automata
2.1. Introduction
2.2. A Brief History
2.3. How Native Chemical Automata are Built in Practice
2.3.1. Selecting the language-automata pair, and the chemistry
2.3.2. Initial conditions and alphabet symbol assignment
2.3.3. Recipe quantification and selection of time interval
2.3.4. Accept/reject criteria optimization
2.4. Reconfiguration and Variants/Extension of Native Chemical Automata
2.4.1. Inclusive hierarchy and automata reconfigurability
2.4.2. Extension to continuous operation (CSTR reactor)
2.4.3. Coupling of Belousov–Zhabotinsky to self-assembly
2.5. Conclusions and Outlook
References
Chapter 3. Discovering Boolean Functions on Actin Networks
3.1. Introduction
3.2. The Actin Network
3.3. Spike-based Gates
3.3.1. Automaton model
3.3.2. Interfacing with the network
3.3.3. Maximizing a number of logical gates
3.3.4. Actin droplet machine
3.4. Voltage-based Gates
3.4.1. The model
3.4.2. Extension to bundle networks
3.4.3. The network
3.4.4. Preliminary results
3.4.5. Results
3.4.5.1. Ideal electrodes
3.4.5.2. Boolean gates
3.4.5.3. Realistic electrodes
3.4.6. Finite state machine
3.4.6.1. Using two values of k = 4 and k = 6
3.5. Discussion
Acknowledgments
References
Chapter 4. Implication and Not-Implication Boolean Logic Gates Mimicked with Enzyme Reactions — General Approach and Application to Signal-Triggered Biomolecule Release Processes
4.1. Introduction
4.2. Mimicking IMPLY Logic Gate
4.3. Mimicking INHIB Logic Gate
4.4. Using the IMPLY and INHIB Logic Gates for Stimulating Molecule Release Function
4.5. Conclusions
Appendix
Acknowledgments
References
Chapter 5. Molecular Computation via Polymerase Strand Displacement Reactions
5.1. Introduction
5.1.1. Logic circuits
5.1.2. Chemical reaction networks
5.1.3. Chapter organization
5.2. Strand Displacement
5.3. Using Strand Displacing Polymerase for Computation
5.4. CRNs Using Strand Displacing Polymerase
5.5. Methods and Protocol
5.5.1. Oligonucleotide design, synthesis, and purification
5.5.2. Fluorescence sample preparation and measurement
5.6. Discussion and Outlook
References
Chapter 6. Optics-Free Imaging with DNA Microscopy: An Overview
6.1. Introduction
6.2. DNA Microscopy for Surface 2D Imaging
6.3. DNA Microscopy for Volumetric 3D Imaging
6.4. Conclusions and Outlook
Acknowledgments
References
Chapter 7. Fully Analog Memristive Circuits for Optimization Tasks: A Comparison
7.1. Introduction
7.2. Dynamical Equation for Memristor Circuits
7.2.1. Single memristor and Lyapunov function
7.2.2. Circuits
7.2.3. Lyapunov function for memristor circuits
7.2.4. Number of fixed points and stability
7.3. Analysis and Comparisons
7.3.1. The instances
7.3.2. Minimization of the continuous Lyapunov function
7.4. Conclusions
Acknowledgments
References
Chapter 8. Organic Memristive Devices for Bio-inspired Applications
8.1. Introduction
8.2. Organic Memristive Device
8.3. Adaptive Circuits
8.4. Relationship of Optical and Electrical Properties
8.5. Neuromorphic Applications
8.5.1. Frequency dependent plasticity
8.5.2. Nervous system mimicking circuits
8.5.3. Towards synapse prosthesis
8.5.4. Stochastic self-organized computational systems
8.6. Conclusions
References
Chapter 9. On Wave-Based Majority Gates with Cellular Automata
9.1. Introduction
9.2. Propagation Patterns in Life-like Rules
9.3. MAJORITY Gates by Competing Patterns
9.4. Final Notes
References
Chapter 10. Information Processing in Plants: Hormones as Integrators of External Cues into Plant Development
10.1. Introduction
10.2. Hormonal Encoding of Environmental Information
10.3. Self-regulatory Hormonal Network Underlying Plasticity in Plant Development
10.3.1. Information processing in the transition from dormancy to germination
10.4. Concluding Remarks
References
Chapter 11. Hybrid Computer Approach to Train a Machine Learning System
11.1. Introduction
11.1.1. A brief introduction to artificial intelligence and machine learning
11.1.2. Analog versus digital computing
11.1.3. Balancing an inverse pendulum using reinforcement learning
11.2. The Analog Simulation
11.3. The Reinforcement Learning System
11.3.1. Value function
11.3.2. Q-learning algorithm
11.3.3. Python implementation
11.3.3.1. States
11.3.3.2. Actions
11.3.3.3. Modeling the action value function Q(s, a)
11.3.3.4. Feature transformation to avoid underfitting
11.3.3.5. Decaying α and ε to improve learning and to avoid overfitting
11.3.4. Hybrid interface
11.4. Results
Acknowledgement
References
Chapter 12. On the Optimum Geometry and Training Strategy for Chemical Classifiers that Recognize the Shape of a Sphere
12.1. Introduction
12.2. The Evolutionary Optimization of Chemical Classifiers
12.3. Results
12.4. Conclusions
Acknowledgments
References
Chapter 13. Sensing and Computing with Liquid Marbles
13.1. Introduction
13.2. Collision-based Gate
13.3. Belousov–Zhabotinsky Cargo
13.4. Photosensor
13.5. Thermal Sensor
13.6. Robot Controller
13.7. Neuromorphic Marbles
13.8. Discussion
Acknowledgments
References
Chapter 14. Towards Colloidal Droplet Processors
14.1. Background
14.1.1. Data throughput
14.1.2. Heat dissipation
14.1.3. Energy consumption
14.2. Liquid Droplet Computer
14.2.1. Holonomic processors
14.3. Colloidal Processors
14.3.1. Phase change architectures
14.3.2. Microfluidic circuits
14.3.3. Layered shells
14.3.4. Granular media, foams, and plasmas
14.4. Biological Processors
14.5. Conclusions
Acknowledgment
References
Chapter 15. Biomolecular Motor-based Computing
15.1. Introduction
15.2. Applications of Biomolecular Motors in Computing
15.2.1. Parallel computing using biomolecular motors in nanofabricated networks
15.2.2. Computing using swarm robots prepared from biomolecular motors
15.2.2.1. Design and construction of molecular robots
15.2.2.2. Swarming of molecular robots
15.2.2.3. Controlling the shape morphology of swarms of molecular robots
15.2.2.4. Remote control of the swarming of molecular robots
15.2.2.5. Logical operation of molecular robots
15.2.2.6. Orthogonal swarming of molecular robots
15.3. Conclusions and Future Perspectives
References
Chapter 16. Computing with Square Electromagnetic Pulses
16.1. Introduction
16.2. Background
16.2.1. Brief historical retrospective
16.2.2. Basics of transmission line theory
16.3. Method
16.3.1. Main computing elements: Cross-points
16.3.2. Four-point crossing: Catt’s junction
16.3.3. Scattering matrix approach
16.3.4. Multi-way junctions: Series and parallel
16.3.5. Simulation results
16.3.6. Scaling down geometries
16.3.7. Pulse generation and control requirements
16.4. Future Directions
16.4.1. 3D-based structures
16.4.2. Graph-based computing
16.4.3. More complex signals and materials
16.5. Conclusion
Acknowledgements
References
Chapter 17. Creative Quantum Computing: Inverse FFT Sound Synthesis, Adaptive Sequencing and Musical Composition
17.1. Introduction
17.2. Why Quantum Computer Music?
17.3. Algorithmic Music
17.4. qSyn: Inverse FFT Sound Synthesis
17.5. qSeq: Adaptive Musical Sequencer
17.6. The Composition Zeno
17.7. Concluding Remarks
Appendix
Acknowledgments
References
Chapter 18. Logical Gates in Natural Erosion of Sandstone
18.1. Introduction
18.2. Modeling of Natural Erosion
18.3. Specific Parameters
18.4. Gates
18.4.1. AND gate
18.4.2. XOR gate
18.4.3. One-bit half-adder
18.5. Discussion
Supplementary Materials
References
Chapter 19. A Case of Toy Computing: Implementing Digital Logics with “Minecraft”
19.1. Introduction
19.2. History and Theory of Toy Computing
19.3. Digital Logics in “Minecraft”
19.3.1. Signal generation and transportation
19.3.2. NAND gates
19.4. Digital Circuits in “Minecraft” and the Remains of Electronics
19.4.1. The circuit
19.4.2. The 4-bit full adder built with TTL-logic
19.4.3. The 4-bit full adder built in “Minecraft”
19.5. Project Discussion
19.5.1. Building a BCD decoder in “Minecraft”
19.5.2. The computational labyrinth
19.5.3. Walking the way of the signal
19.6. Summary
Chapter 20. Evolving Conductive Polymer Neural Networks on Wetware
20.1. Introduction
20.2. Introduction of Polymer Neural Networks
20.3. Electropolymerisation of Conducting Polymer Wire
20.3.1. Polymer wire growth
20.3.2. Wire diameter and growth rate
20.3.3. Conductance increase of wires
20.3.4. Directional growth of wire and 3D growth
20.4. Machine Learning for Polymer Wire Growth
20.4.1. Supervised learning: Simple perceptron for simple logic gates
20.4.2. Unsupervised learning: Autoencoder for feature extraction
20.5. Conclusions
References
Index


Handbook of Unconventional Computing. Volume 1: Theory

WSPC Book Series in Unconventional Computing
Print ISSN: 2737-5218; Online ISSN: 2737-520X

Published: Handbook of Unconventional Computing (In 2 Volumes). Volume 1: Theory; Volume 2: Implementations. Edited by Andrew Adamatzky.

WSPC Book Series in Unconventional Computing. In 2 Volumes

Handbook of Unconventional Computing
Volume 1: Theory

Editor
Andrew Adamatzky, University of the West of England, Bristol, UK

World Scientific: New Jersey, London, Singapore, Beijing, Shanghai, Taipei, Chennai, Tokyo, Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

WSPC Book Series in Unconventional Computing HANDBOOK OF UNCONVENTIONAL COMPUTING (In 2 Volumes) Volume 1: Theory Volume 2: Implementations Copyright © 2022 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 978-981-123-503-0 (set, hardcover)
ISBN 978-981-123-526-9 (set, ebook for institutions)
ISBN 978-981-123-527-6 (set, ebook for individuals)
ISBN 978-981-123-571-9 (vol. 1, hardcover)
ISBN 978-981-123-572-6 (vol. 1, ebook for institutions)
ISBN 978-981-123-573-3 (vol. 2, hardcover)
ISBN 978-981-123-574-0 (vol. 2, ebook for institutions)

For any available supplementary material, please visit https://www.worldscientific.com/worldscibooks/10.1142/12232#t=suppl

Typeset by Stallion Press. Email: [email protected]

Printed in Singapore


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_fmatter

Preface

Progress rests on exploration. An exploration is a departure from, or challenge to, traditional norms, and an assessment of the other possibilities or choices available. This book uncovers a wide range of amazing paradigms, algorithms, architectures and implementations of computation, often far outside the comfort zone of mainstream, conventional computing sciences. There are two volumes. One deals mostly with theoretical results and algorithms, the other with experimental laboratory implementations or computer models of novel computing substrates. The distribution of chapters between volumes is often subjective because a majority of the chapters fit well into both the “theoretical” and “implementation” categories. The first volume overviews topics related to the physics of computation, theory of computation, information theory and cognition, and evolution and computation. The boundaries between the topics are fuzzy and many chapters cover more than one topic. Physics of computation deals with analog computation, quantum computation, field computation, programmable matter, artificial morphogenesis, reversible logic elements with memory, the interplay between the costs of realising physical computation and the computing capacity one can harvest from a physical device, and physical randomness in computation. Information theory and cognition are covered by estimations of integrated information (information generated by a system beyond the information generated by its individual elements) using complexity measures, and by architectures for adding narratives to robot cognition, including an experimental scenario for investigating the narrative hypothesis in a combination of physical


and simulated robots. The theory of computation is presented by a study of the computational power of the scatter machine bounded in polynomial space, a review of exclusively quantum problems (those which a Turing machine cannot solve), an extensive overview of membrane computing concepts, and insights into symmetric automata and computation. Other important topics include evolving Boolean regulatory networks with variable gene expression times, swarm and stochastic computing for global optimization, unsupervised learning on bio-inspired topologies, and probabilistic logic gates in the asynchronous Game of Life. Topics presented in the second volume deal with computing in chemical systems, novel materials, biopolymers, alternative hardware, and unclassified topics. Chemical computing includes mimicking Boolean logic gates with enzyme reactions, chemical oscillatory reactions, computation with polymerase strand displacement reactions, chemical automata, and training of chemical classifiers. The chapters on novel materials introduce organic memristive devices for bio-inspired applications, sensing and computing with liquid marbles, colloidal droplet processors, and evolving conductive polymer neural networks. The biopolymers part of the volume is about biomolecular motor-based computing and optics-free DNA microscopy. Alternative hardware includes a hybrid computer approach to training a machine learning system, computing with square electromagnetic pulses, logical gates in natural erosion of sandstone, fully analog memristive circuits for optimization tasks, and wave-based majority gates with cellular automata. Other (unclassified) exciting topics are information processing in plants with hormones, creative quantum computing, and digital logic with Minecraft. All chapters are self-contained and accessible to a reader with basic training in the exact sciences. This treatise on alternative computing appeals to everyone — from high-school students to university professors, from mathematicians, computists and engineers to chemists, biologists, and material scientists.

Andrew Adamatzky
July 2021


Contents

Preface

Chapter 1. Mapping the Territory of Computation Including Embodied Computation
Bruce J. MacLennan

Chapter 2. Reversible Logic Element with Memory as an Alternative Logical Device
Kenichi Morita

Chapter 3. Space Bounded Scatter Machines
João Alves Alírio, José Félix Costa and Luís Filipe Fonseca

Chapter 4. Exclusively Quantum: Computations Only Possible on a Quantum Computer
Selim G. Akl

Chapter 5. Estimations of Integrated Information Based on Algorithmic Complexity and Dynamic Querying
Alberto Hernández-Espinosa, Hector Zenil, Narsis A. Kiani and Jesper Tegnér

Chapter 6. Robot Narratives
Marina Sanz Orell, James Bown, Susan Stepney, Richard Walsh and Alan F. T. Winfield

Chapter 7. Evolving Boolean Regulatory Networks with Variable Gene Expression Times
Larry Bull

Chapter 8. Membrane Computing Concepts, Theoretical Developments and Applications
Erzsébet Csuhaj-Varjú, Marian Gheorghe, Alberto Leporati, Miguel Ángel Martínez-del-Amor, Linqiang Pan, Prithwineel Paul, Andrei Păun, Ignacio Pérez-Hurtado, Mario J. Pérez-Jiménez, Bosheng Song, Luis Valencia-Cabrera, Sergey Verlan, Tingfang Wu, Claudio Zandron and Gexiang Zhang

Chapter 9. Computing with Modest Resources: How to Have Your Cake and Eat it Too
Vasileios Athanasiou and Zoran Konkoli

Chapter 10. Physical Randomness Can Help in Computations
Olga Kosheleva and Vladik Kreinovich

Chapter 11. Probabilistic Logic Gate in Asynchronous Game of Life with Critical Property
Yukio-Pegio Gunji, Yoshihiko Ohzawa and Terutaka Tanaka

Chapter 12. A Mystery of Human Biological Development — Can It Be Used to Speed up Computations?
Olga Kosheleva and Vladik Kreinovich

Chapter 13. Symmetric Automata and Computations
Mark Burgin

Chapter 14. Computation By Biological Means
Alexander Hasson and Dan V. Nicolau

Chapter 15. Swarm and Stochastic Computing for Global Optimization
Xin-She Yang

Chapter 16. Vector Computation
Karl Svozil

Chapter 17. Unsupervised Learning Approach Using Reinforcement Techniques on Bio-inspired Topologies
Karolos-Alexandros Tsakalos, Georgios Ch. Sirakoulis and Andrew Adamatzky

Chapter 18. Intelligent Gamesourcing — Artificial Intelligence in Problem Solution by Game Playing
Ivan Zelinka, Jiri Arleth, Michal Bukacek and Tran Trong Dao

Index

© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0001

Chapter 1

Mapping the Territory of Computation Including Embodied Computation

Bruce J. MacLennan
Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee 37996, USA
[email protected]

Investigation of alternatives to conventional computation is important both in order to have a comprehensive science of computing and to develop future computing technologies. To this end, we consider the full range of computational paradigms that is revealed when we relax the familiar assumptions of conventional computation. We address the topology of information representation, the topology of information processing, and unconventional notions of programmability and universality. The physics of computation is especially relevant in the post-Moore's law era, and so we focus on embodied computation, an alternative computing paradigm that focuses on the physical realization of computation, either making more direct use of physical phenomena to solve computational problems, or more directly exploiting the physical correlates of computation to implement some intended physical process. Examples of the exploitation of physical processes for information processing include analog computation, quantum computation, and field computation. Examples of the use of embodied computation for physical purposes include programmable matter and artificial morphogenesis.

1.1. Unconventional Computation

Unconventional computation, non-standard computation, alternative computation: I take these to be synonyms, but what do they mean? They are all negative terms, defined more by what they are not than by what they are; they refer to computation that is not conventional or standard, or that is an alternative to what is conventional and


standard. Therefore, a computing paradigm may be “alternative” by deviating from the common characteristics of computing as we have come to know them. Among these are binary digital electronics, sequential electronic logic, von Neumann architecture, discrete data representation, discrete memory units randomly addressable by natural numbers, modifiable memory, programs stored in this memory, sequential execution from memory, deterministic processing, conditional and iterative control mechanisms, and hierarchical program organization. You can no doubt think of more, and some otherwise quite conventional computing systems might lack one or another of these characteristics, but they indicate possible directions for alternative computing paradigms. The conventional computing paradigm has been wildly successful, and it is reasonable to question the value of investigating alternatives, but there are at least two motivations. The first motivation, and I think the most important, is scientific. Computer science is the science of computing and so it should investigate computing in all its manifestations, artificial and natural. Restricting attention to conventional computation would be akin to biologists restricting their study to bacteria (because they are so numerous) while ignoring all other living things. Computation may be defined as a physical process with the function or purpose of processing information (see Refs. [1, 2] for more on distinguishing computing from other physical processes). Throughout history, many computational techniques have been developed, including manual arithmetic, slide rules, abacuses, and similar devices, but also geometric constructions and tools (e.g., pantographs). Also included are formal logical techniques, including logical calculi and various sorts of logic diagrams. Over the centuries machines have been designed to do more-or-less automatic computation, including (electro-)mechanical arithmetic calculators, mechanical analog integrators and Fourier analyzers [3, 4], and (electro-)mechanical analog differential analyzers [5, 6]. In the realm of electronic computation, we have the modern digital computer, but also analog computers, which were once as common as digital machines and are returning to importance for some applications,


such as neural networks [7]. It behoves us to study all the ways computing has been done in the past, both to understand it more completely and to better understand the possibilities for future, alternative computing technologies. Computer science also investigates the manifestations of computing and information processing in nature, not only in the brains of individual animals, but also in the “group brains” of social animals (including human beings). Computational principles are involved in the cooperation and coordination of flocks of birds, schools of fish, and herds of land animals. Social insects (and much simpler organisms, such as slime molds) solve optimization problems and construct complex nests and colonies that simultaneously fulfill multiple functions. Evolution by natural selection is itself information processing, searching a complex space for designs with selective advantage. The discipline of natural computation studies these instances of computation in nature, but also takes inspiration from them to develop future computing methods. It is apparent that computation in nature has little in common with conventional computation: it is rarely binary or even digital, but more often analog; it is rarely strictly sequential and more commonly asynchronous and parallel; typically it does not separate memory and processing or make use of stored programs; it is often nondeterministic; it operates reliably and robustly under conditions of significant and unavoidable uncertainty, errors, defects, faults, and noise; and so on [1]. Nevertheless, natural computation is very effective; it has facilitated the evolution of innumerable species, including our own. This demonstrates that computing is much more than our familiar digital von Neumann devices. Therefore, in Section 1.2, we will consider computation in the broad sense. Another important motivation for investigating alternative computing is the inevitable end of Moore's Law. Although this empirical law results from a complex interaction of technological and economic factors, it is apparent that it cannot continue forever. Due to the atomic structure of matter and the laws of physics, there is a limit to the smallness, density, and speed of electronic components. The semiconductor industry continues to squeeze out incremental


improvements, but they are coming at an increasing cost and the end is in sight. A more distant, but harder, barrier is posed by the von Neumann–Landauer bound, which arises from the fact that information must be represented physically [8]. As a consequence, erasing, destroying, or losing a bit of information must dissipate a certain minimum amount of energy, specifically k_B T ln 2, where k_B is Boltzmann's constant and T is the ambient temperature; this minimum energy is about 18 meV at 300 K. The von Neumann–Landauer limit, which was originally established theoretically, has recently been confirmed empirically [9]. Therefore, any computational operations that forget information (in either computation or control) must dissipate at least this much energy, which usually appears as heat. The only way to avoid this limitation is to avoid discarding information during computation, which is accomplished by reversible computing, an alternative computing paradigm that has been explored both theoretically [10–13] and practically [14].
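To make the quoted figure concrete, the bound is easy to check numerically. Below is a minimal illustrative sketch in Python (an editorial example, not from the original chapter; the constants are the standard SI values):

```python
import math

k_B = 1.380649e-23    # Boltzmann's constant, J/K
eV = 1.602176634e-19  # joules per electronvolt

def landauer_limit(T):
    """Minimum dissipation for erasing one bit at temperature T (joules)."""
    return k_B * T * math.log(2)

E = landauer_limit(300.0)                     # ambient temperature, 300 K
print(f"{E:.2e} J = {E / eV * 1e3:.1f} meV")  # ~2.87e-21 J = ~17.9 meV, i.e., about 18 meV
```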

1.1.1. Implications

Conventional computing technology is built upon many cleanly separated levels of abstraction. Primitive data abstractions, such as integers, characters, and real numbers, are implemented in terms of bits, which are realized by physical devices with continuous state spaces but operated in saturation, so as to simulate binary states. Primitive data processing operations, such as addition, division, and comparison, are realized by sequential digital logic, which is realized by devices obeying continuous physical laws, but switching quickly between saturated states. This clean separation of abstraction levels has facilitated independent progress on each level. For example, the same basic sequential digital logic has survived while switching technology has progressed from mechanical, to electromechanical (relays), to vacuum tubes (valves), to discrete transistors, to integrated circuits, to very large scale integration. Algorithms (e.g., Newton's algorithm, sorting algorithms) that ran on the earliest computers can be and are run on the latest computers. Therefore,

progress on these various levels has not required us to abandon the accumulated technological knowledge on other levels, which has enabled rapid progress in computer science. Unfortunately, we are reaching the limits of this hierarchical computing technology, with its many levels between our programming abstractions (embodied, e.g., in high-level programming languages) and the physical laws governing our basic computing devices. To achieve greater component densities and speeds, the number of layers of abstraction must be decreased, since each layer introduces an “abstraction cost”. The only way to accomplish this, since the laws of physics are invariable, is to bring our programming abstractions closer to the underlying physics [15]. That is, our programming abstractions should be more like the physical processes that realize them. The laws of physics are continuous — differential and partial differential equations — and therefore one implication of this increasing assimilation of computing to physics is a greater dependence on analog models of computation, that is, computation that is continuous in state space and perhaps also dynamics. Analog computing avoids the inefficiency of implementing continuous computation in terms of digital computation that is in turn realized by continuous physical processes. This conclusion applies as well to quantum computing, which is founded on a continuous wave function, continuous (complex-valued) superpositions of basis states, and continuous dynamics (Schrödinger's equation). The clean separation of levels of abstraction has depended on the accuracy of simulation at each level; for example, our digital switches really behave like perfect switches. This has been facilitated by the largeness of our devices (compared to atomic dimensions), by redundancy in our digital state representations, and by the relative slowness of our digital processes compared to the underlying physics (e.g., allowing processes to reach saturation in much less than a clock cycle). Our technology has been built on a stack of idealized abstractions, in which the idealizations have been close enough to reality to be reliable. As we push toward smaller devices and higher densities, however, and toward higher speeds, these idealizations break down.


In particular, noise, inaccuracy, imperfections, faults, uncertainty, and other deviations from our idealized models will be unavoidable. Instead of striving to eliminate them completely (which would increase cost and inefficiency), we should embrace these inevitable physical phenomena as sources of free variability, which can be exploited for computational purposes (e.g., escape from deadlock, exploration, non-determinism). Natural computation, that is, computational processes in nature, teaches us ways to use this free variability, since noise and error are unavoidable in nature, and living systems, which are imperfect, have evolved to survive under these circumstances.

1.2. Computation in General

Since we will need to “think outside of the Boolean box” to develop future alternative models of computation, it will be worthwhile to explore the landscape and boundaries of the idea of computation. We will consider first the range of possibilities for information representation, and second the variety of dynamical processes by which information might be processed.

1.2.1. Topology of information representation

Rolf Landauer coined the slogan “Information is physical” to remind us that information must be embodied in physical reality (at least if it is to be used in any way), and therefore that information is not independent of physical properties and limitations [8]. (I have already mentioned the von Neumann–Landauer limit as an example.) In order to explore the physical nature of information, it is convenient to use the Aristotelian distinction of form and matter (hylomorphism) [15]. For our purposes, the form (Grk. morphē, eidos; Lat. forma) is some discernible or discriminable arrangement or structure of an underlying medium or substrate, the matter (Grk. hulē; Lat. materia), which might be physical matter or energy. Form and matter are relative terms, in which form refers to physical properties that are intended or used to represent the information, and matter refers to those physical properties that are not. Although


information must be instantiated in some physical substrate, it is, qua information, independent of the matter and exists only in the form. This independence leads to the misleading idea of disembodied or non-physical information, to which Landauer objected. On the other hand, information is multiply realizable in that the same or an equivalent form can be realized in various material substrates, so long as they support the fundamental differences of form required to represent the information. (Thus, information is potentially separable from matter, but in actuality always materially realized.) Most material substrates possess a large number of physical properties, many degrees of freedom, only some of which are used to represent information. Therefore it is useful to distinguish information-bearing degrees of freedom (IBDF) from non-information-bearing degrees of freedom (NIBDF) [13]. The distinction is grounded in the information processing, in the computational processes to which the information is subject. Which are relevant to the computation? Which irrelevant? For an example, we may consider a simple electrical implementation of a bit: if all the free electrons are on one plate of a capacitor, it represents a 0; if they are on the other plate, a 1. This is the IBDF; the positions and velocities of the electrons within the plate are NIBDF, for they are irrelevant to whether a 0 or 1 is represented. The relevant distinctions of form that constitute an information space can often be expressed in terms of the topology of the space. For example, a conventional digital computer operates on the discrete topology on {0, 1} (or any homeomorphic space); an analog computer might operate on continuous variables with states in the continuum [0, 1] (or spaces homeomorphic to it). Another might operate on a bounded region of the complex plane. These are all supported by a variety of physical realizations. More generally, the topology of an information space defines relationships of similarity and difference, of approximation and distance. It is the background on which information processing and control takes place. Second-countable metric spaces (i.e., metric spaces with a countable base) are a useful framework for many models of computation, for they include both discrete spaces and continua [16].
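To make the capacitor example and the role of topology concrete, here is a small illustrative sketch (an editorial example, not code from the chapter): the same normalized physical quantity read through two information spaces, the discrete space {0, 1} and the continuum [0, 1], each with its natural metric.

```python
def discrete_metric(x, y):
    """Metric inducing the discrete topology on {0, 1}: states are
    either identical or maximally different; similarity has no degrees."""
    return 0.0 if x == y else 1.0

def continuum_metric(x, y):
    """Usual metric on [0, 1]: states can be more or less similar."""
    return abs(x - y)

v = 0.93  # a normalized physical quantity (e.g., a capacitor's charge state)

# Digital reading: threshold to the nearest element of {0, 1}; every other
# aspect of v is treated as a non-information-bearing degree of freedom.
bit = min((0, 1), key=lambda s: continuum_metric(v, s))

# Analog reading: the whole continuous value is information-bearing.
level = v

print(bit, level)  # 1 0.93
```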


Information often has a complex constituent structure, which is exemplified by the data structures used in conventional computer programming. In general, the topology of these composite information spaces is in some relevant sense the product of the topologies of its constituent spaces. In conventional digital computation, the components of a data structure are discrete, for example, the homogeneous elements of an array or the heterogeneous elements of a structure (record); memory in a von Neumann machine is an indexable array of discrete bytes or words. Many analog computers similarly operate on a discrete set of continuous variables or on vectors or matrices of continuous variables, and so they have a discrete constituent structure. However, analog computers have also made use of continuous distributions of continuous information, that is, of continuous fields of data. Early analog computers used a field analogy method for solving partial differential equation problems [6, p. 34], and fields provide a model for optical and quantum computing technologies [17–20]. For these information spaces, Hilbert spaces often provide the relevant topology [21]. In general, information spaces may be function spaces on discrete or continuous domains.
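As a brief illustrative sketch of a field-valued information space (an editorial example using NumPy, not from the chapter), a continuous field can be approximated by a dense sample vector, and processing becomes the application of an operator to the field as a whole:

```python
import numpy as np

# A one-dimensional field: a continuous distribution of continuous values,
# approximated here by a dense sample on [0, 1].
x = np.linspace(0.0, 1.0, 256)
field = np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)

# A linear operator on the field: local averaging (a crude diffusion step).
kernel = np.array([0.25, 0.5, 0.25])
smoothed = np.convolve(field, kernel, mode="same")

# The operator acts on the field as a whole, not on indexed discrete cells.
print(field.shape, smoothed.shape)  # (256,) (256,)
```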

1.2.2. Topology of information processing

Information is physical, as Landauer said, and therefore information processing is also physical, which gets at the heart of computing. Computation is a physical process, but what distinguishes it as computation from other physical processes is that (in hylomorphic terms) it depends only on the form of its state and not on its matter to fulfill its purpose [1, 2]; we may call these information processes. Therefore, an information process (computation) is multiply realizable, since it can be realized by any physical system that supports the formal distinctions and transformations used in the computation. In general, a computation may be considered a dynamical system coupled with its environment. The state space of the composite system is a product of the external state space (at least as manifest in the interface between the coupled systems) and the computational state space; either may be discrete or continuous. If the state space is discrete, then the dynamics is necessarily discrete as well. Often in a

discrete-time dynamical system, the state changes at fixed intervals, perhaps controlled by a clock, but another possibility is sequential dynamics, in which the sequence of states is defined, but not the state transition times [22]. Conventional digital computer programs exhibit sequential dynamics, for the sequence of operations is defined, but there is no presumption that they take equal amounts of time. If, on the other hand, the state space is a continuum, then the dynamics can be either continuous or discrete (including sequential). In all these cases, the information relationships within the computational system constrain the dynamics of the composite system to fulfill some function or purpose. Within this broad framework, there are many alternatives. Are states strictly ordered or only partially ordered? Are state changes deterministic or probabilistic? Are they reversible (as in Brownian and quantum computing)? In states with either a finite discrete set of components or with a continuum of components (such as a field), how are the dynamics of the components related? Sequential? Synchronous parallel? Asynchronous parallel? Stochastic? The dynamics of information processing in natural systems, non-living as well as living, suggest many possibilities.
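The picture of a computation as a state space plus a transition map coupled to an input stream can be made concrete with a toy sketch (an editorial example, not from the chapter): a one-bit machine with sequential dynamics that recognizes odd parity. The order of transitions is fixed, but nothing is assumed about their timing.

```python
def step(state, symbol):
    """Transition map on the discrete state space {0, 1}:
    flip the state on input 1, hold it on input 0."""
    return state ^ symbol

def run(inputs, state=0):
    # Sequential dynamics: the sequence of states is defined by the input
    # stream, but the state-transition times are left unspecified.
    for symbol in inputs:
        state = step(state, symbol)
    return state

print(run([1, 0, 1, 1]))  # 1: the stream contains an odd number of 1s
```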

1.2.3. Programmability

Conventional computation is generally associated with the idea of programmability, but computers in general do not need to be programmable; we can have special-purpose computers designed for a single computation. Nevertheless, programmability is important, for it allows a single computer to be easily reconfigured for different computational purposes, but we must consider programmability in a more general sense. There is a tendency now to define programmability in relation to the universal Turing machine or its equivalents (lambda calculus, etc.), but as will be discussed later (Section 1.2.4), alternative models of computation require alternative definitions of computability. If we think about the variety of ways that we use the verb “program”, it is apparent that it refers to a process by which the behavior of some system can be controlled by some abstract means. That is, a

more general class of possible behavior is restricted to some desired subclass by means of an abstract specification, the program. To put it in hylomorphic terms, the programmable system is the matter, which has a broad class of possible behaviors, from which a subclass is selected by the imposition of a form, which is the program. The form (program) organizes the computational substrate or medium to have the desired dynamical behavior. The form becomes operative (is executed) by its embodiment in an appropriate computational medium; so long as it is embodied only in a non-computational medium (e.g., a piece of paper or a text file), it remains merely a potential program. We are most familiar with textual and diagrammatic representations of programs, that is, programs expressed in programming languages or flow diagrams. Both have a discrete constituent structure, which represents the dynamics in terms of basic computational processes. The space of possible programs for a computer is generally a formal language (over a finite, discrete alphabet) defined by a formal grammar. Of course, programs are often translated from one form to another, for example, from a program for an abstract computer to an equivalent program for a physical computer. Programs need not belong to discrete spaces; they can belong to continua. For example, the dynamics of a computation can be governed by a Hamiltonian function or a potential surface. In a simple case, the input is encoded in the initial state, and the output is encoded in a fixed-point attractor to which the system converges, governed by the potential function. Such continuous programs may be described in a discrete language; for example, a potential function might be defined by a finite set of equations, describing a problem Hamiltonian, as is done in quantum annealing and similar optimization techniques. In this case, the discrete program implicitly defines a continuous program.
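A minimal illustrative sketch of this convergence-as-computation idea (an editorial toy example, not from the chapter): take the double-well potential V(x) = (x^2 - 1)^2 as the "program"; the input is the initial state, and the output is whichever fixed-point attractor (x = +1 or x = -1) gradient descent settles into.

```python
def grad_V(x):
    """Gradient of the double-well potential V(x) = (x**2 - 1)**2,
    which has fixed-point attractors at x = -1 and x = +1."""
    return 4.0 * x * (x * x - 1.0)

def converge(x, rate=0.01, steps=10_000):
    """Relax the state down the potential surface; the attractor reached
    encodes the output of the computation."""
    for _ in range(steps):
        x -= rate * grad_V(x)
    return x

print(round(converge(0.3), 6))   # input in the right basin -> 1.0
print(round(converge(-2.5), 6))  # input in the left basin  -> -1.0
```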

In other cases, a continuous program might be expressed directly; the appropriate metaphor might not be writing a program, but rather sculpting a program [1]. More commonly, continuous programs will not

be created explicitly by some programmer, but will emerge from machine learning.

1.2.4. Universality

Any discussion of alternative, unconventional computing paradigms must address the issue of Church–Turing computability, which has been the de facto definition of computability for the better part of a century. It is so familiar that it is difficult to imagine other definitions, and the assumptions on which it is built are largely forgotten. We must remember that Church, Turing, and the others who created the theory of computation were trying to formalize the notion of effective calculability in the foundations of mathematics. The assumptions built into the model (discrete symbols, exact matching, sequential dynamics, finiteness of representation and processing, etc.) were appropriate for the problems they were addressing, and it is quite remarkable that it has been applicable to conventional computing more generally. Nevertheless, we must recall that the Turing machine is a model of computation, and like all models it is useful because it makes simplifying assumptions that are unproblematic for the domain of questions it is intended to address. We may call this domain of questions and issues the model's frame of relevance [1, 23]. Models generally give incorrect or misleading answers when applied outside of their frame of relevance or near to its boundaries, where their simplifying assumptions affect the answers. Near or beyond the boundaries, we are in danger of obtaining answers that have more to do with the model and its assumptions than with the system being modeled. Questions of computing power (e.g., whether a computing paradigm has the power of a universal Turing machine), depending on how they are framed (e.g., in terms of the class of functions that can be computed), might or might not be in a model's frame of relevance. For example, the Church–Turing model is generally ill-equipped to answer questions about real-time performance and real-time emulation, which are relevant to some notions of computing power.


A more pragmatic model of computational universality, which is sometimes used, may be termed logic-gate computational universality. It is based on the observation that all our conventional, von Neumann computers are constructed from a few kinds of logic gates, and therefore any computing paradigm that can implement arbitrary finite circuits of these gates can, in principle, do anything a conventional digital computer can do. (This is, of course, only an approximation to Church–Turing universality, which also requires an unbounded memory; conventional digital computers are finite-state machines.) This model is suitable for addressing questions of binary computation, but is less useful for questions relevant to unconventional computation (e.g., analog computing), which are outside of its frame of relevance.
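A minimal sketch of this notion of universality: given one suitable gate (here NAND) and the ability to wire copies together, the other Boolean operations follow by composition. The gate constructions are the standard ones; the demonstration harness around them is our own.

```python
# Logic-gate computational universality in miniature: every gate below is a
# finite circuit of NANDs, so any medium that realizes NAND and wiring can,
# in principle, realize them all.

def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    m = nand(a, b)                    # the classic four-NAND XOR
    return nand(nand(a, m), nand(b, m))

for a in (0, 1):
    for b in (0, 1):
        assert not_(a) == 1 - a
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
        assert xor(a, b) == (a ^ b)
print("NOT, AND, OR, XOR all recovered from NAND alone")
```

Note, as the text observes, that this demonstrates only circuit universality over a fixed finite memory, not Church–Turing universality.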

In natural computation and many other alternative computing paradigms (see, e.g., Section 1.3), other notions of universality may be more relevant than the universal Turing machine.1 In general, we are asking what class of dynamical systems can be implemented in a particular computational medium (matter) by a specified space of programs (forms). Since this dynamical system might be intended to interact with the external, non-computational environment, issues of real-time performance, accuracy, physical resources, and energy dissipation might be relevant. In other words, the question of whether a programmed system is "equivalent" to some hardware might not be a simple matter of computing the same class of functions; it might be more than this in some respects, and less in others. We must beware of being seduced by our familiar models and theories.

1.3. Embodied Computation

1.3.1. Definition

If future progress in computation, especially post-Moore's law computation, will require a greater assimilation of computation to physics, then we need to look deeper into the relation between computations and their physical realizations. As discussed above, computation has been viewed traditionally as an abstract process largely independent of its material embodiment. This parallels a Cartesian approach to cognition, which treats it as information processing independent of the brain and of the body more generally. Cartesianism has been found inadequate in cognitive science, and embodied cognition approaches cognition from the perspective of information processes realized in a biological brain, whose principal function is controlling a physical body in its physical environment.24 When cognition is approached from this perspective, many problems become easier to solve. Neurological processes in the physical brain simplify some information processing tasks, as does the fact that the brain is controlling a body with specific physical properties. Embodied computation applies these insights to information processing more generally by thematizing and exploiting the relation between computation and its physical realization. On one hand, it makes more direct use of physical processes to implement information processes, thereby achieving a closer assimilation of computation to physics. On the other, it provides a framework for understanding and designing systems in which the goal is not information processing per se, but in which information processes are essential to some intended physical process. With these considerations in mind, we have proposed the following definition of embodied computation:

    Embodied computation may be defined as computation in which the physical realization of the computation or the physical effects of the computation are essential to the computation.25

(Reference [26] uses "embodied computation" in a different, but related, sense. Other related ideas are material computation and in materio computation.27) We consider first the use of physical processes for computational purposes, which is more familiar and easier to understand, and then we turn to the use of computational processes for physical purposes.

1.3.2. Physics for computational purposes

Embodied computation can exploit physical processes for more direct and effective realization of a computational process when the physical process has a mathematical structure that is the same as, or closely related to, that of the computational process. This is of course analog computation in the original sense, in which a target system is simulated by a more convenient analogous system with the same mathematical structure.


This is especially advantageous when the physical system has a very large number of information-bearing elements (e.g., atoms) or when the information is represented by a continuous field, for in these cases the simultaneous interaction of the spatial elements can directly implement what would be a very expensive computation on a conventional computer (e.g., solving a system of PDEs). Since all computation must be physically realized, the reader may wonder how embodied computation differs from conventional computation. It is in fact a matter of degree. Conventional computers provide a generic physical realization that is universally adequate for a broad class of computations (roughly, Turing-computable functions). These are the familiar electronic bits and binary operations discussed previously (Section 1.1). Because embodied computation depends on more direct physical realizations of computational processes, or because it is in direct physical interaction with its environment, the possible physical realizations may be more limited. For example, there may be few specific realizations that have an appropriate mathematical structure for a computation (e.g., obey appropriate PDEs) and that also operate at a rate suitable for environmental interactions. In other words, in embodied computation we cannot ignore physical realization to the same degree that we have in conventional computation, and different specific realizations might be more or less suitable for different embodied computations. A hallmark of conventional computation is multiple realizability, which means that, in principle, any computation can be realized on any computer that has the power of a universal Turing machine; this independence of specific physical realization depends on the many levels of abstraction between computations and the physical processes implementing them. Multiple realizability is also important in embodied computation, and so we look for computational abstractions that can be realized (more directly than in conventional computation) by a variety of physical processes, thus expanding the range of usable specific realizations.


Therefore, one challenge in embodied computation is to identify or design physical processes that have the same mathematical structure as useful computations, while also having desirable physical characteristics (e.g., speed and controllability). It is not essential that the physical process have the same mathematical structure as the computation, so long as it can be easily "programmed" (in the sense of Section 1.2.3) to simulate the desired computation without many levels of abstraction.

1.3.2.1. Transduction

Input and output transduction, which is a transformation between the generic computational realization and a specific input or output medium, is somewhat different in embodied computation than in conventional computation.1 This is easiest to understand in the context of conventional embedded computers, which have a number of sensors and actuators that allow the computer to receive information from the physical system in which it is embedded and to have physical effects on that system. Sensors are specific to the form of matter or energy that they sense, and thus they transduce information with a specific physical realization into information internal to the computer, which is generically realized, that is, in principle realizable by any other appropriate physical process with the same mathematical structure. In contrast, the specific physical realization of the input cannot be changed, or the sensor will be sensing the wrong information. For example, an electronic analog computer might represent quantities by voltages; that is the generic computational realization. It is generic because it could, in principle, be changed to another realization (e.g., fluid pressure) and still accomplish the computer's function. On the other hand, a light sensor has to detect light intensity; it cannot be changed to something else, such as temperature, without changing the function of the computer. The situation is similar for actuators: generically realized information in the computer is transduced into a specific physical form so that it can have the intended effect in the embedding physical system and environment. So the computational medium (e.g., voltage or fluid pressure) is converted to a specific medium (e.g., mechanical force or light intensity) as required for its purpose. In an ideal transduction process, only the matter of the information is changed, not its form, as it is transferred from a specific realization to the computational realization or vice versa. In practice, there is also some change of form (for example, a continuous signal might be digitized or limited in range). We can think of ideal input transduction as the process of removing the units from a physical quantity and turning it into a pure number, and output transduction as the inverse process of turning a pure number into a physical quantity by applying appropriate physical units. In conventional computation, transduction has a bow-tie organization, with a single computational realization being the target of multiple input transductions and the source of multiple output transductions. The issue of transduction is more complex in embodied computation, since we might not have the benefit of a single computational representation as either the target or the source of transductions. Rather, we will need to transduce physical input information into the specific physical realization that will be used for the computations to be applied to it. Conversely, output signals will need to be transduced from the physical realization of the computation that produced them to the appropriate physical realization of the output signals.
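The sketch below renders ideal transduction as the removal and application of units. The particular quantities (light intensity in, mechanical force out) and all scale factors are hypothetical choices made only to make the bow-tie pattern concrete.

```python
# Input transduction strips the units from a specific physical quantity,
# yielding a pure number in the generic computational realization; output
# transduction applies units again. The full-scale values are assumptions.

def light_sensor(intensity_lux, full_scale_lux=1000.0):
    """Input transduction: light intensity -> dimensionless level in [0, 1]."""
    return min(intensity_lux / full_scale_lux, 1.0)

def force_actuator(level, max_force_newtons=50.0):
    """Output transduction: dimensionless level in [0, 1] -> mechanical force."""
    return level * max_force_newtons

x = light_sensor(250.0)    # 0.25: a pure number, realization-independent
y = 1.0 - x                # the computation proper sees only pure numbers
print(force_actuator(y))   # 37.5 (newtons): units restored on the way out
```

Between the two transducers the computation sees only pure numbers, which is exactly what makes the inner realization generic and exchangeable.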

1.3.2.2. Analog computation

Analog computation originally referred to computation in which some convenient physical system was used as a model to simulate some target system of interest. However, since most analog computers (whether mechanical, electrical, or of some other medium) used continuous physical processes, and because they were most often used to model physical systems obeying continuous laws (expressed as systems of ordinary or partial differential equations), the term "analog computation" soon came to refer to computation in continuous media, as opposed to "digital" computation, implemented by discrete (generally binary) processes. They are more properly called continuous computation and discrete computation, respectively.


Mathematically, analog computers are continuous-time dynamical systems, typically defined by a system of ordinary or partial differential equations (ODEs or PDEs). When applied to more conventional computational problems, the inputs may be encoded in the initial state of the system, and the corresponding outputs are encoded in the final states (attractors) to which it converges. All analog computation is physically realized, but we call it embodied to the extent that there is an ongoing interaction between the computation and its physical environment, in which inputs may define boundary conditions for the computation or define a subset of extrinsic variables, and a subspace of the internal state space determines the outputs. Analog computing, therefore, applies physical processes to computation by identifying or implementing a dynamical system that is able, relatively directly, to solve the computational problem. There are many examples of such dynamical-system solutions, even for discrete combinatorial optimization problems, such as Boolean satisfiability.28, 29 The challenge for embodied analog computation is to find a physical dynamical system that can be easily configured to solve the problem. The problem of finding or designing a physical system to realize a desired dynamical system points to the need for general-purpose analog computers (GPACs), as was also apparent in an earlier generation of analog computation.6 Ideally, we would like a model of universal analog computation comparable to the universal Turing machine (UTM) in conventional computation, but it does not yet exist. Embodied computation is outside of the frame of relevance of the Church–Turing model of computation, and so the UTM is not a very useful basis for a universal analog computer (see Section 1.2.4).1 One promising approach to universal analog computation and GPACs is provided by various universal approximation theorems in mathematics, which typically establish how compositions of a restricted set of basis functions can be used to approximate, as closely as required, a given function in a large and interesting class.17, 18, 30 Polynomial and Fourier series approximations are familiar examples, but not necessarily the most useful for GPACs. Already in the 1940s Claude Shannon proved theorems establishing the computational capabilities of GPACs inspired by the (mechanical) differential analyzer,31, 32 which were later corrected.33–36 More useful, perhaps, are universal approximation theorems that show how GPACs can be designed around sigmoidal neural networks and radial basis function networks [30, pp. 166–168, 219–220, 236–239, 323–326].
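As a sketch of how a universal approximation theorem yields a programmable general-purpose device, the code below "programs" a fixed bank of Gaussian radial basis functions by choosing mixing weights, here fitted by least squares to sin(x). The target function, basis count, and width are illustrative assumptions.

```python
import numpy as np

# Universal approximation in miniature: a fixed family of basis functions
# (Gaussian RBFs) is specialized to a particular function purely by the
# choice of weights; the weights play the role of the program.

def rbf_design_matrix(x, centers, width):
    return np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)

x = np.linspace(0.0, 2 * np.pi, 200)
centers = np.linspace(0.0, 2 * np.pi, 15)
Phi = rbf_design_matrix(x, centers, width=0.7)

w, *_ = np.linalg.lstsq(Phi, np.sin(x), rcond=None)   # fit the "program"
max_err = np.max(np.abs(Phi @ w - np.sin(x)))
print(f"max approximation error: {max_err:.2e}")      # small, e.g. ~1e-4
```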

1.3.2.3. Quantum computation

Quantum computation is a paradigmatic example of embodied computation, for it makes direct use of the phenomena of quantum mechanics to perform computations that would be prohibitively expensive on conventional computers. This is because quantum computers have the potential to perform an exponential number of conventional computations in quantum superposition. In addition to the digital-computer-inspired "circuit" or "logic-gate" model of quantum computation,37 there are also models that treat the complex amplitudes of quantum states as continuous variables, thereby providing a kind of analog quantum computation.38 Hilbert spaces provide the mathematical framework for quantum computation.

1.3.2.4. Field computation

Conventional computers have a discrete address space comprising discrete variables (bits, bytes, words) in a regular array, typically indexed by natural numbers. As discussed in Section 1.2.1, some alternative models of computation provide fields, that is, continua of continuous quantity. In principle, individual values are indexed by real numbers, but in practice field computation operates on entire fields in parallel.17, 18 Mathematically, fields are treated as functions belonging to Hilbert spaces, field operations are (possibly nonlinear) Hilbert space operators, and dynamical systems are defined by PDEs.21 Embodied field computation makes use of physical processes operating on physical fields and defined by PDEs. Computational fields may be realized by physical fields that are literally continuous (such as electromagnetic fields) or by phenomenological fields, such as fluids, which are composed of discrete elements, but of sufficient number and density to be treated as continua.


Many physical processes, including electrical, optical, and chemical processes, are described by PDEs operating on fields, and they are potential realizations of field computation. For example, physical diffusion can be used for broadcasting information, optimization, and parallel search.39–44 Large populations of microorganisms (e.g., bacteria, slime molds) can be used to solve some problems (rather slowly).45 Indeed, quantum computation is a kind of field computation implemented by linear operations on the wave function. Functional analysis provides a series of universal approximation theorems that are a basis for programmable general-purpose field computers.18 These include approximation by Taylor series over Banach spaces, generalized Fourier series, and methods analogous to convolutional neural networks.18, 21
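A minimal sketch of field computation by diffusion, assuming a 1D medium with periodic boundaries: the loop applies an explicit finite-difference step of the heat equation u_t = D u_xx, which is the update a physical diffusing field performs continuously, in parallel, and for free.

```python
import numpy as np

# Broadcasting by diffusion: a value deposited at one site spreads to all
# sites. The grid size, D, and step sizes are illustrative assumptions
# (dt*D/dx^2 = 0.2 keeps the explicit scheme stable).

def diffuse(u, D=1.0, dx=1.0, dt=0.2, steps=500):
    u = u.copy()
    for _ in range(steps):
        lap = np.roll(u, -1) + np.roll(u, 1) - 2 * u   # discrete Laplacian
        u += dt * D / dx**2 * lap                      # explicit Euler step
    return u

u0 = np.zeros(100)
u0[50] = 1.0                          # a point source "broadcasts" its value
print(diffuse(u0)[45:56].round(4))    # a smooth bump spread around site 50
```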

1.3.3. Computation for physical purposes

We have seen how the idea of embodied computation suggests ways that the physical realization of a computation can be exploited to better fulfill its computational purposes. Here we shift the focus to see how computation can be used to better fulfill the purposes of some physical system. What makes a physical system an example of embodied computation is that it uses computational concepts and techniques in an essential way in order to achieve desired physical behavior and effects. Computation is a physical process, but in these cases it is the physical process that fulfills the purpose of the system, and the computation is a means to that end. In hylomorphic terms, the computation's formal relations, evolving in time, impose a desired physical process on a relatively unstructured material substrate. In other words, when a computation takes place, matter and energy are transported and transformed within a physical system, such as a computer. In embodied computation we make use of this fact to achieve some desired series of physical states by means of the computations that they realize.

These applications of embodied computation may sound like embedded computation, in which a computational system is part of a physical system that it helps to control, but in embedded computation the computer has its own physical realization (e.g., in electronic circuits) separate from the physical system it is controlling. In embodied computation, the physical realization of the computation is the system being controlled; there is no distinction between controller and controlled. Reaction-diffusion systems provide a simple example of embodied computation for physical effect.46 They are fundamentally mathematical systems: a system of 2D PDEs combining diffusion with nonlinear reaction terms. Under a variety of conditions, studied by Turing, they evolve into stable arrangements of spots and stripes now called Turing patterns.47 As computational systems they can be realized in a variety of media, but when they are realized by morphogens in the skin of a developing embryo, they lead to the characteristic hair color and skin pigmentation patterns of various species (the leopard's spots, the tiger's stripes, etc.).48–50 Therefore, if for some application we want to arrange matter or energy in a Turing pattern, we can do this by using the required materials to realize an appropriate reaction-diffusion system.
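To make this concrete, here is a minimal sketch using the Gray–Scott reaction-diffusion model, a standard pattern-forming system in the same family that Turing studied (it is a stand-in, not a specific system from this chapter). The parameter values are common choices from the simulation literature.

```python
import numpy as np

# Gray-Scott reaction-diffusion on a periodic 2D grid. Realized by morphogens
# in a physical medium, the same dynamics would deposit spots physically;
# here it merely fills an array.

def laplacian(Z):
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0) +
            np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

n, Du, Dv, F, k = 128, 0.16, 0.08, 0.035, 0.065
U, V = np.ones((n, n)), np.zeros((n, n))
U[60:68, 60:68], V[60:68, 60:68] = 0.50, 0.25          # local perturbation

for _ in range(5000):                                  # integrate the PDEs
    UVV = U * V * V
    U += Du * laplacian(U) - UVV + F * (1 - U)
    V += Dv * laplacian(V) + UVV - (F + k) * V

print("pattern contrast:", float(V.max() - V.min()))   # > 0: spots formed
```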

More generally, developing embryos provide many examples of embodied computation for physical effect. The proliferating cells in an embryo communicate and coordinate with each other to control their movement, differentiation, and adhesion so as to create an animal's complex physical body.51 The communication and coordination of the cells is an information process because it could, in principle, be realized by different physical substances and still fulfill its function. Other examples of naturally occurring embodied computation for physical effect include the communication and control within social insect colonies, by which they organize their group behavior and construct their nests.50

Embodied computation for physical applications has different tradeoffs and requirements than conventional computation (including embedded computation). For example, conventional computation and embodied computation applied to information processing both exhibit some degree of multiple realizability: the physical realization does not matter too much so long as it supports the required computations. (We don't care what matter and energy are moved around so long as the pattern of their movement realizes the computation.) With embodied computation, the computation must be realized by physical processes that are appropriate for the application, which will often dictate the physical realization. When the purpose of computation is information processing, we usually want it to execute as quickly as possible, with a minimum expenditure of energy and other resources, which are just physical means to the computational end. Therefore, progress in computer technology has been measured by a decrease in the amount of matter and energy involved in basic computational operations: from relatively massive relay armatures, to the substantial currents in vacuum tubes and the write currents for ferrite cores, to regularly decreasing operating voltages and charges in semiconductor devices (a major factor in Moore's law). Much of the technological progress in computing has been directed at representing bits and implementing bit operations with less matter and energy. When embodied computation is used for physical applications, however, we may want to move more matter and energy rather than less, since the system itself or the desired physical effects may be large. Physically bigger bits, for example, may be more suitable to the application.

1.4. Programmable Matter

Embodied computation for physical effect is exemplified by programmable matter, in which computational methods are used to control the physical properties of a material.52, 53 Programmable matter has many potential applications, including in radically reconfigurable systems. Reconfigurable systems are valuable because they allow a system to be adapted, for example, to a new mission, new circumstances, or damage recovery, instead of being replaced, which may be economically or physically infeasible. A conventional reconfigurable system may be reconfigured by changing the connections among a fixed set of components, but the range of reconfiguration is limited by the built-in components.

Examples include field-programmable gate arrays (FPGAs) and field-programmable analog arrays (FPAAs). A radically reconfigurable system goes beyond this by allowing the physical properties of the components themselves to be changed.54 That is, rather than rearranging a fixed set of hardware resources, radical reconfiguration changes the hardware resources. This requires systems whose physical properties are under programmatic control. One way to accomplish limited radical reconfigurability is by implementing a random-access configuration memory with cells whose (possibly non-binary) states have distinct physical properties (e.g., conductance, reflectance, capacitance, switching, amplification, mechanical force). Then, as a program runs with at least some of its storage in the configuration memory, the properties of the cells will change under program control. We may call this externally programmed configuration or assembly, because the configuration memory is controlled by a separate program execution unit. Of course, the same can be accomplished by a conventional computer connected to the configuration memory as an I/O device, but then the computation would not be embodied; to be embodied, the configuration memory should have an important role in the computational process (a matter of degree, of course). Even though such a system gives some control over physical properties, it is limited by the state space of the memory cells, which is a function of their design, and we might wonder if there is a way to programmatically control a potentially unbounded variety of physical properties. This might seem unlikely, but nature has provided an existence proof: proteins.25, 54 Among many other things, proteins include the keratin of feathers, horns, and nails, the elastin and collagen of connective tissue, the tubulin that forms the cellular cytoskeleton (enabling rigidity and movement), and signaling molecules, ion channels, and enzymes. There are also active proteins, such as rhodopsin, which senses light, motor proteins, proteins that make decisions by changing their conformation, and of course the proteins that correct errors in DNA and transcribe it into RNA for the assembly of other proteins. In summary, proteins are an effectively infinite class of compounds with a very wide range of active and passive physical properties.


What is also important for our purposes is that they are all produced by a common process from a limited set of basic components (20 amino acids). Two other factors give proteins their generality and diversity. First, they are composed of long chains of amino acid residues, which relax into complex conformations that determine their physical properties (and hence their chemical and biological properties). It is the complexity of protein folding that gives them their enormous diversity of properties. Second, the amino acid sequences are encoded in DNA, which allows general programmability of the proteins' structure and facilitates a common set of information (computational) processes for exploring the space of DNA sequences (e.g., mutation, crossover, deletion, transposition, and duplication, explored through evolution by natural selection). So combinatorial diversity in DNA sequences translates to diversity of physical behavior in the proteins. These ideas can be applied to artificial protein-like molecules as well. The first requirement is a class of polymers composed of a small fixed set of components, such that the chains relax or fold into complex configurations with a very wide variety of physical properties. Second, we require a combinatorially rich data structure, such as a string, that represents the component sequence, and a means for translating the string into physical polymers (the embodied part of the computation). Aping biology again, the space of possible sequences can be searched by genetic algorithms, for example, to find polymer sequences that fold into useful structures. Radical reconfiguration, then, is able to change the sequences to assemble polymers with the required properties.
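A minimal sketch of this kind of search: a genetic algorithm over strings from a 20-letter alphabet. The fitness function below merely matches a hypothetical target sequence; in a real application it would score the folded configuration of the polymer instead.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 amino-acid letters
TARGET = "MKVLHW"                   # hypothetical "useful" sequence

def fitness(seq):
    """Stand-in fitness: positions agreeing with the target."""
    return sum(a == b for a, b in zip(seq, TARGET))

def mutate(seq, rate=0.2):
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in seq)

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

pop = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(60)]
for gen in range(200):
    pop.sort(key=fitness, reverse=True)     # best sequences first
    if fitness(pop[0]) == len(TARGET):
        break                               # found a maximally fit sequence
    parents = pop[:20]                      # truncation selection
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(40)]
print("generation", gen, "best:", pop[0])
```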

1.5. Artificial Morphogenesis

Proteins and protein analogs use an information structure to determine a polymer chain, which then folds passively under physical forces into a configuration with functional properties. The resulting individual molecules may self-assemble into relatively homogeneous tissues or other materials with a statistically regular structure.


For more complex heterogeneous and irregular structures, we can use externally programmed assembly, but the resulting structures will be limited by the fixed physical arrangement of the cells whose states are under program control. For more complex structures, we may turn to internally programmed assembly, in which the individual components need not react purely passively, but can have their own behavioral programs by which they self-assemble into the required structure. An example of this approach is algorithmic assembly by DNA.55–57 For a general and robust approach to assembling more complex physical structures, especially those with a hierarchical structure spanning many length scales, perhaps from microns up to meters, we can again look to biology for inspiration, because developing embryos accomplish this, coordinating an ever-increasing number of cells (ultimately in the trillions) in their movement, differentiation, and interaction to produce reliably a specific complex body form. This is certainly an example of embodied computation for physical effect, since the goal of the information processes embodied in the cells' interactions is for the cells to assemble themselves into a physical body. In embryology, morphogenesis refers to the process by which embryos develop 3D structures, and so the technological application of these ideas may be termed artificial morphogenesis or morphogenetic engineering. Projects investigating this technology have taken a variety of approaches, often differing in how closely they model the biological processes.58–67 Our approach to artificial morphogenesis follows a common embryological practice of using PDEs to describe morphogenetic processes. This is appropriate in biology because of the large number and small size of cells compared to the tissues they constitute, and also because developing tissues often have the characteristics of soft matter (viscoelastic materials), which may be described by continuum mechanics. In artificial morphogenesis, PDEs have the advantage of being suited to describing processes involving very large numbers of very small agents (e.g., microrobots, synthetic microorganisms). Descriptions are also relatively independent of the size and number of the agents, so long as the continuum approximation is close enough.


This serves our goal of having morphogenetic algorithms that scale well to very large numbers of agents and that are relatively independent of the agents' size. To facilitate and test morphogenetic algorithms, we have developed a morphogenetic programming language, which allows "substances" to be defined, with properties and behaviors described by PDEs.68, 69 We can execute these algorithms on a conventional computer, but if they were realized in an appropriate physical medium (e.g., a massive swarm of programmed microrobots or genetically engineered microorganisms), then the result of execution would be the desired physical structure. We are currently developing global-to-local compilation algorithms to translate from the PDEs (which describe the behavior of continua of infinitesimal particles) to behavioral programs for finite numbers of agents of a specific size, while maintaining a large range of scale independence.70 Through simulation we have demonstrated the application of artificial morphogenesis to several problems, including the routing of dense fiber bundles between regions of an artificial brain, and the assembly of an insect-like body frame with specified numbers of spinal segments, legs, and leg segments.69–71

1.6. Conclusions

Computation is a physical process, but of a very special kind, in which the physical interactions can be described as information processes. For both scientific and technological reasons we should be exploring the full range of computational paradigms, both artificial and natural. One direction for future alternative computing paradigms is embodied computation, which focuses on the physical realization of computational processes. On the one hand, this suggests new ways that physical processes can be used for computation, thus providing directions for post-Moore's law computing. On the other, it points towards applications of computing in which the physical processes realizing the computation are the purpose of the computation, thereby using programs as a general means of directly determining physical processes. In particular, the physical properties of materials can be controlled programmatically, and physical systems with complex hierarchical structures over many size scales can be assembled by morphogenetic algorithms.

References

1. B. J. MacLennan, Natural computation and non-Turing models of computation. Theoretical Computer Science 317, 115–145 (2004).
2. B. J. MacLennan, Super-Turing or non-Turing? Extending the concept of computation. Int. J. Unconvent. Comput. 5(3–4), 369–387 (2009).
3. J. Lipka, Graphical and Mechanical Computation (Wiley, New York, 1918).
4. W. Thomson, Harmonic analyzer. Proc. Roy. Soc. 27, 371–373 (1878).
5. A. B. Clymer, The mechanical analog computers of Hannibal Ford and William Newell. IEEE Ann. Hist. Comput. 15(2), 19–34 (1993).
6. J. S. Small, The Analogue Alternative (Routledge, London & New York, 2001).
7. B. J. MacLennan, The promise of analog computation. Int. J. Gen. Syst. 43(7), 682–696 (2014). doi: 10.1080/03081079.2014.920997.
8. R. Landauer, Irreversibility and heat generation in the computing process. IBM J. Res. Develop. 5(3), 183–191 (1961). Reprinted, Vol. 44, No. 1/2, Jan./March 2000, pp. 261–269.
9. A. Berut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider, and E. Lutz, Experimental verification of Landauer's principle linking information and thermodynamics. Nature 483, 187–189 (2012). doi: 10.1038/nature10872.
10. C. H. Bennett, Logical reversibility of computation. IBM J. Res. Develop. 17(6), 525–532 (1973).
11. C. H. Bennett, The thermodynamics of computation — A review. Int. J. Theo. Phys. 21(12), 905–940 (1982).
12. E. F. Fredkin and T. Toffoli, Conservative logic. Int. J. Theo. Phys. 21(3/4), 219–253 (1982).
13. C. H. Bennett, Notes on Landauer's principle, reversible computation, and Maxwell's Demon. Stud. Hist. Phil. Mod. Phys. 34, 501–510 (2003).
14. M. P. Frank, Introduction to reversible computing: Motivation, progress, and challenges. In CF '05: Proceedings of the 2nd Conference on Computing Frontiers, Ischia, Italy, May 4–6, 2005.
15. B. J. MacLennan, Bodies — Both informed and transformed: Embodied computation and information processing. In G. Dodig-Crnkovic and M. Burgin (eds.), Information and Computation: Essays on Scientific and Philosophical Understanding of Foundations of Information and Computation, World Scientific Series in Information Studies, vol. 2 (World Scientific, Singapore, 2011), pp. 225–253.


16. B. J. MacLennan, The U-machine: A model of generalized computation. Int. J. Unconvent. Comput. 6(3–4), 265–283 (2010).
17. B. J. MacLennan, Technology-independent design of neurocomputers: The universal field computer. In M. Caudill and C. Butler (eds.), Proceedings of the IEEE First International Conference on Neural Networks, vol. 3 (IEEE Press, 1987), pp. 39–49.
18. B. J. MacLennan, Field computation in natural and artificial intelligence. Inform. Sci. 119, 73–89 (1999).
19. B. J. MacLennan, Field computation: A framework for quantum-inspired computing. In S. Bhattacharyya, U. Maulik, and P. Dutta (eds.), Quantum Inspired Computational Intelligence: Research and Applications, Chapter 3 (Morgan Kaufmann/Elsevier, Cambridge, MA, 2017), pp. 85–110. doi: 10.1016/B978-0-12-804409-4.00003-6.
20. B. J. MacLennan, Topographic representation for quantum machine learning. In S. Bhattacharyya, I. Pan, A. Mani, S. De, E. Behrman, and S. Chakraborti (eds.), Quantum Machine Learning, Chapter 2 (De Gruyter, Berlin/Boston, 2020).
21. B. J. MacLennan, Foundations of Field Computation. URL http://web.eecs.utk.edu/~bmaclenn/FFC.pdf.
22. T. van Gelder, Dynamics and cognition. In J. Haugeland (ed.), Mind Design II: Philosophy, Psychology and Artificial Intelligence, revised & enlarged edn., Chapter 16 (MIT Press, Cambridge, MA, 1997), pp. 421–450.
23. B. J. MacLennan, Transcending Turing computability. Minds Mach. 13, 3–22 (2003).
24. M. Johnson and T. Rohrer, We are live creatures: Embodiment, American pragmatism, and the cognitive organism. In J. Zlatev, T. Ziemke, R. Frank, and R. Dirven (eds.), Body, Language, and Mind, vol. 1 (Mouton de Gruyter, Berlin, 2007), pp. 17–54.
25. B. J. MacLennan, Embodied computation: Applying the physics of computation to artificial morphogenesis. Parallel Process. Lett. 22(3), 1240013 (2012).
26. H. Hamann and H. Wörn, Embodied computation. Parallel Process. Lett. 17(3), 287–298 (2007).
27. S. Stepney, The neglected pillar of material computation. Physica D 237(9), 1157–1164 (2008).
28. M. Ercsey-Ravasz and Z. Toroczkai, Optimization hardness as transient chaos in an analog approach to constraint satisfaction. Nat. Phys. 7, 966–970 (2011).
29. B. Molnár and M. Ercsey-Ravasz, Asymmetric continuous-time neural networks without local traps for solving constraint satisfaction problems. PLoS ONE 8(9), e73400 (2013). doi: 10.1371/journal.pone.0073400.
30. S. Haykin, Neural Networks and Learning Machines, 3rd edn. (Pearson Education, New York, 2008).
31. C. E. Shannon, Mathematical theory of the differential analyzer. J. Math. Phys. Mass. Inst. Technol. 20, 337–354 (1941).


32. C. E. Shannon, Mathematical theory of the differential analyzer. In N. J. A. Sloane and A. D. Wyner (eds.), Claude Elwood Shannon: Collected Papers (IEEE Press, New York, 1993), pp. 496–513.
33. M. Pour-El, Abstract computability and its relation to the general purpose analog computer (some connections between logic, differential equations and analog computers). Trans. Am. Math. Soc. 199, 1–29 (1974).
34. L. A. Rubel, The brain as an analog computer. J. Theoret. Neurobiol. 4, 73–81 (1985).
35. L. Lipshitz and L. A. Rubel, A differentially algebraic replacement theorem. Proc. Am. Math. Soc. 99(2), 367–372 (1987).
36. L. A. Rubel, Some mathematical limitations of the general-purpose analog computer. Adv. Appl. Math. 9, 22–34 (1988).
37. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, 10th anniversary edn. (Cambridge University Press, Cambridge, 2010).
38. S. Lloyd and S. L. Braunstein, Quantum computation over continuous variables. Phys. Rev. Lett. 82, 1784–1787 (1999). doi: 10.1103/PhysRevLett.82.1784.
39. O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 5, 90–98 (1986).
40. M. Miller, B. Roysam, K. Smith, and J. O'Sullivan, Representing and computing regular languages on massively parallel networks. IEEE Trans. Neural Netw. 2, 56–72 (1991).
41. E. Rimon and D. Koditschek, The construction of analytic diffeomorphisms for exact robot navigation on star worlds. In Proceedings of the 1989 IEEE International Conference on Robotics and Automation, Scottsdale, AZ (IEEE Press, New York, 1989), pp. 21–26.
42. O. Steinbeck, A. Tóth, and K. Showalter, Navigating complex labyrinths: Optimal paths from chemical waves. Science 267, 868–871 (1995).
43. P. Ting and R. Iltis, Diffusion network architectures for implementation of Gibbs samplers with applications to assignment problems. IEEE Trans. Neural Netw. 5, 622–638 (1994).
44. J. W. Mills, B. Himebaugh, B. Kopecky, M. Parker, C. Shue, and C. Weilemann, "Empty space" computes: The evolution of an unconventional supercomputer. In Proceedings of the 3rd Conference on Computing Frontiers (ACM Press, New York, 2006), pp. 115–126.
45. A. Adamatzky, Physarum Machines: Computers from Slime Mould. World Scientific Series on Nonlinear Science Series A, vol. 74 (World Scientific, Singapore, 2010).
46. A. Adamatzky, B. De Lacy Costello, and T. Asai, Reaction-Diffusion Computers (Elsevier, Amsterdam, 2005).
47. A. Turing, The chemical basis of morphogenesis. Philos. Trans. Roy. Soc. B 237, 37–72 (1952).
48. J. D. Murray, Lectures on Nonlinear Differential-Equation Models in Biology (Oxford University Press, Oxford, 1977).


49. P. K. Maini and H. G. Othmer (eds.), Mathematical Models for Biological Pattern Formation (Springer-Verlag, New York, 2001).
50. S. Camazine, J. Deneubourg, N. R. Franks, J. Sneyd, G. Theraulaz, and E. Bonabeau, Self-Organization in Biological Systems (Princeton University Press, Princeton, NJ, 2001).
51. G. Forgacs and S. A. Newman, Biological Physics of the Developing Embryo (Cambridge University Press, Cambridge, UK, 2005).
52. T. Toffoli and N. Margolus, Programmable matter: Concepts and realization. Physica D: Nonlinear Phenomena 47(1), 263–272 (1991). doi: 10.1016/0167-2789(91)90296-L.
53. S. C. Goldstein, J. D. Campbell, and T. C. Mowry, Programmable matter. Computer 38(6), 99–101 (2005).
54. B. J. MacLennan, The morphogenetic path to programmable matter. Proceedings of the IEEE 103(7), 1226–1232 (2015). doi: 10.1109/JPROC.2015.2425394.
55. P. Rothemund and E. Winfree, The program-size complexity of self-assembled squares. In Symposium on Theory of Computing (STOC) (Association for Computing Machinery, New York, 2000), pp. 459–468.
56. R. Barish, P. Rothemund, and E. Winfree, Two computational primitives for algorithmic self-assembly: Copying and counting. Nano Lett. 5, 2586–2592 (2005).
57. P. Rothemund, N. Papadakis, and E. Winfree, Algorithmic self-assembly of DNA Sierpinski triangles. PLoS Biology 2(12), 2041–2053 (2004).
58. H. Kitano, Morphogenesis for evolvable systems. In E. Sanchez and M. Tomassini (eds.), Towards Evolvable Hardware: The Evolutionary Engineering Approach (Springer, Berlin, 1996), pp. 99–117.
59. R. Nagpal, A. Kondacs, and C. Chang, Programming methodology for biologically-inspired self-assembling systems. In AAAI Spring Symposium on Computational Synthesis: From Basic Building Blocks to High Level Functionality (March 2003). URL http://www.eecs.harvard.edu/ssr/papers/aaaiSS03-nagpal.pdf.
60. A. Spicher, O. Michel, and J. Giavitto, Algorithmic self-assembly by accretion and by carving in MGS. In Proceedings of the 7th International Conference on Artificial Evolution (EA '05), Lecture Notes in Computer Science, no. 3871 (Springer-Verlag, Berlin, 2005), pp. 189–200.
61. S. Murata and H. Kurokawa, Self-reconfigurable robots: Shape-changing cellular robots can exceed conventional robot flexibility. IEEE Robot. Autom. Mag., 71–78 (2007).
62. R. Doursat, Organically grown architectures: Creating decentralized, autonomous systems by embryomorphic engineering. In R. P. Würtz (ed.), Organic Computing (Springer, 2008), pp. 167–200.
63. B. J. MacLennan, Morphogenesis as a model for nano communication. Nano Commun. Netw. 1(3), 199–208 (2010). doi: 10.1016/j.nancom.2010.09.007.


64. P. Bourgine and A. Lesne (eds.), Morphogenesis: Origins of Patterns and Shapes (Springer, Berlin, 2011).
65. J. Giavitto and A. Spicher, Computer morphogenesis. In P. Bourgine and A. Lesne (eds.), Morphogenesis: Origins of Patterns and Shapes (Springer, Berlin, 2011), pp. 315–340.
66. R. Doursat, H. Sayama, and O. Michel, A review of morphogenetic engineering. Nat. Comput. 12(4), 517–535 (2013). doi: 10.1007/s11047-013-9398-1.
67. H. Oh, A. Ramezan Shirazi, C. Sun, and Y. Jin, Bio-inspired self-organising multi-robot pattern formation: A review. Robot. Auton. Syst. 91, 83–100 (2017). doi: 10.1016/j.robot.2016.12.006.
68. B. J. MacLennan, Preliminary development of a formalism for embodied computation and morphogenesis. Technical Report UT-CS-09-644, Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN (2009).
69. B. J. MacLennan, A morphogenetic program for path formation by continuous flocking. Int. J. Unconvent. Comput. 14, 91–119 (2019).
70. B. J. MacLennan and A. C. McBride, Swarm intelligence for morphogenetic engineering. In A. Schumann (ed.), Swarm Intelligence: From Social Bacteria to Human Beings (Taylor & Francis/CRC, 2020).
71. B. J. MacLennan, Coordinating swarms of microscopic agents to assemble complex structures. In Y. Tan (ed.), Swarm Intelligence, Vol. 1: Principles, Current Algorithms and Methods, PBCE 119, Chapter 20 (Institution of Engineering and Technology, 2018), pp. 583–612.


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0002

Chapter 2

Reversible Logic Element with Memory as an Alternative Logical Device

Kenichi Morita
Hiroshima University, Higashi-Hiroshima 739-8527, Japan
[email protected]

In the traditional design theory of logic circuits and digital computers, logic gates are supposed to be the basic logical primitives. Although the operations of logic gates are easily understandable for humans and have a long history of investigation, they may not be a good basis for the recent computing paradigms that directly utilize microscopic physical/chemical phenomena as operations. We discuss this problem in the framework of reversible computing. Here, we focus on a reversible logic element with memory (RLEM) as a candidate for such a basic device. We survey past results on RLEMs and examine the advantages of using an RLEM as a logical primitive. By this we can obtain new insights on reversible computing, which cannot be seen if a logic gate is used as a primitive. For example, a very unique architecture for reversible Turing machines using RLEMs is possible, and such machines are easily embedded in simple reversible environments.

2.1. Introduction

The design theory of logic circuits based on logic gates, such as AND, OR, NOT, and NAND, was developed in the first half of the 20th century. Since then, logic gates have been used as standard logical primitives for conventional computers, and they are still being used now. It should be noted that the notions of these logical operations were originally extracted from human thinking and reasoning. In fact, the operations AND, OR, and NOT had already been characterized by the philosophers of the Stoic and Megarian Schools in ancient Greece by analyzing human thinking (see Ref. [1]).


Thus these operations themselves have quite a long history. On the other hand, various computing paradigms that directly utilize microscopic physical/chemical phenomena have recently appeared: for example, molecular computing, quantum computing, and reversible computing. Although the operations of the conventional logic gates are easily understandable for humans, we should not be tied to this old framework when investigating new computing paradigms. There will surely be much more suitable primitives that reflect microscopic physical/chemical laws and can be used to compose computing systems in the new paradigm in an efficient way. In this chapter, we consider the problem of finding an appropriate logical primitive for reversible computing. In other words, it is the problem of looking for a conceptual device that is suitable for connecting the path from a reversible microscopic law to a reversible computer (Figure 2.1). Such a primitive must satisfy the following properties. First, it is directly related to the reversible microscopic laws, and hence realizable in a reversible environment. Second, reversible computers can be efficiently composed of it in a systematic way. Here, we consider a reversible logic element with memory (RLEM),2–4 rather than a reversible logic gate, as a candidate for such a device, and examine its advantages. The contents of the following sections are as follows. In Section 2.2, the definition of an RLEM is given, and a typical 2-state RLEM called a rotary element (RE) is illustrated. It is then explained that we can construct reversible sequential machines and reversible Turing machines out of REs in a very unique way. In Section 2.3, the universality of RLEMs is discussed. It is remarkable that all non-degenerate 2-state RLEMs except only four are universal.

Reversible microscopic law → Reversible logic element with memory → Reversible computer

Figure 2.1. A pathway from a reversible microscopic law to a reversible computer. It is important to give a suitable conceptual device on this pathway for an efficient realization of a reversible computer.


In Section 2.4, we discuss possibilities of realizing RLEMs in reversible environments; we consider the billiard ball model and a reversible cellular automaton with a very simple local transition function as such reversible environments. In Section 2.5, based on these observations, we summarize the advantages of using an RLEM as a logical device.

2.2. Reversible Logic Element with Memory (RLEM)

Reversible computing is a paradigm that reflects microscopic physical reversibility. So far, various kinds of reversible computing models have been proposed and investigated.3–8 A reversible logic element is a primitive for composing reversible logic circuits, which are further used to construct reversible computing machines. There are two types of reversible logic elements: one without memory, which is usually called a reversible logic gate, and one with memory. In both cases, the function of a reversible element is described by a one-to-one mapping. In the conventional design theory of logic circuits, logic gates are used as primitive elements (but in the study of asynchronous circuits, logic elements with memory are sometimes used9, 10). Also in reversible computing, many kinds of useful reversible logic gates have been proposed; the Fredkin gate6 and the Toffoli gate8, 11 are typical ones. However, as we shall see below, a reversible logic element with memory is also useful.

2.2.1. Definition of a reversible logic element with memory

We first define a sequential machine (SM) and a reversible SM (RSM). An SM is a finite automaton with an output port as well as an input port, which is often called an SM of Mealy type.

Definition 1. A sequential machine (SM) is defined by M = (Q, Σ, Γ, δ), where Q is a finite set of internal states, Σ and Γ are finite sets of input and output symbols, and δ : Q × Σ → Q × Γ is a move function. If δ is injective, M is called a reversible sequential machine (RSM).


Note that if M is reversible, then |Σ| ≤ |Γ| must hold. A reversible logic element with memory is defined as a kind of RSM.

Definition 2. A reversible logic element with memory (RLEM) is an RSM M = (Q, Σ, Γ, δ) such that |Σ| = |Γ|. In particular, it is called a |Q|-state |Σ|-symbol RLEM.

Consider an RLEM M = (Q, {a1, . . . , an}, {s1, . . . , sn}, δ). The move function δ of M gives the next state of M and the output symbol deterministically from the present state and the input symbol. If the present state is p, the input symbol is ai, and δ(p, ai) = (q, sj), then the next state is q and the output is sj, as shown in Figure 2.2(a). To use an RLEM as a logic element for composing a logic circuit, we interpret it as a machine having "decoded" input ports and output ports, as in Figure 2.2(b). That is, for each input symbol, there is a unique input port to which a signal (or a particle) can be given. Likewise, for each output symbol, there is a unique output port from which a particle can appear. Note that particles should not be given to two or more input ports at the same time; the operation of the RLEM M is undefined for the case that particles are given to two or more input ports simultaneously.

Figure 2.2. (a) A movement of a reversible logic element with memory (RLEM): in state p at time t, an input particle on port ai yields state q and an output particle on port sj at time t + 1. (b) An interpretation of an RLEM as a module having decoded input ports a1, . . . , an and output ports s1, . . . , sn.
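Definitions 1 and 2 translate directly into code: a move function is a finite map, and reversibility is exactly its injectivity. The 2-state 2-symbol move function below is our own toy illustration, not an RLEM taken from this chapter.

```python
# A sequential machine's move function delta : Q x Sigma -> Q x Gamma as a
# dict; the machine is reversible iff no two (state, input) pairs map to the
# same (state, output) pair.

def is_reversible(Q, Sigma, delta):
    images = [delta[(q, a)] for q in Q for a in Sigma]
    return len(images) == len(set(images))      # injectivity check

Q, Sigma, Gamma = {"p", "q"}, {"a1", "a2"}, {"s1", "s2"}
delta = {("p", "a1"): ("q", "s1"), ("p", "a2"): ("p", "s2"),
         ("q", "a1"): ("p", "s1"), ("q", "a2"): ("q", "s2")}

assert is_reversible(Q, Sigma, delta)
# Since |Sigma| = |Gamma| and delta is injective, this RSM is a 2-state
# 2-symbol RLEM in the sense of Definition 2.
```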


It is, of course, possible to extend an RLEM so that it can receive two or more particles. However, we do not do so, for the following reasons. First, without such an extension, RLEMs are sufficiently powerful (see Section 2.3). Second, if we allow two or more input particles, a synchronization problem among many particles arises; that is, we must design a logic circuit very carefully so that particles arrive at each logic element at the same time. This is similar to the case of conventional circuits composed of logic gates. On the other hand, if we limit the number of input particles to one, then the sole input particle interacts with the internal state of the element, which is regarded as "stationary" information. Therefore, the input can be given at any moment. By this, the structures of logic circuits composed of RLEMs can be greatly simplified. In this respect, an RLEM as a logic element has a completely different character from that of a logic gate.

2.2.2. Rotary element (RE), a typical RLEM

There are infinitely many kinds of RLEMs if we do not limit the numbers of states and symbols. In Section 2.3, 2-state RLEMs will be classified, and their universality discussed. But here we consider a specific 2-state 4-symbol RLEM called a rotary element, since its operation is easily understood. A rotary element (RE)2 is depicted by a box that contains a rotatable bar inside (Figure 2.3). The two states of an RE are distinguished by the direction of the bar, and thus they are called state H and state V. There are four input lines and four output lines, corresponding to the set of input symbols {n, e, s, w} and the set of output symbols {n′, e′, s′, w′}.

Figure 2.3. Two states of a rotary element (RE): state H (horizontal bar) and state V (vertical bar), each with input ports n, e, s, w and output ports n′, e′, s′, w′.


Figure 2.4. Operations of an RE between times t and t + 1: (a) the parallel case, and (b) the orthogonal case.

Table 2.1. The move function δRE of an RE.

                           Input
  Present state      n        e        s        w
  State H          V w′     H w′     V e′     H e′
  State V          V s′     H n′     V n′     H s′

The rotatable bar is used to control the move direction of an input particle. When no particle exists, nothing happens to the RE. If a particle comes from the direction parallel to the rotatable bar, then it goes out from the output line on the opposite side without affecting the direction of the bar (Figure 2.4(a)). If a particle comes from the direction orthogonal to the bar, then it makes a right turn and rotates the bar by 90° (Figure 2.4(b)). The RE is reversible in the following sense: from the next state and the output, the previous state and the input are uniquely determined. More precisely, an RE is defined as the following RSM: MRE = ({H, V}, {n, e, s, w}, {n′, e′, s′, w′}, δRE), where δRE is given in Table 2.1.
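As a concrete check, the sketch below transcribes Table 2.1 into a finite map and verifies the reversibility property just stated: the previous state and input of an RE can be reconstructed uniquely from the next state and output.

```python
# delta_RE from Table 2.1: (state, input port) -> (next state, output port).
delta_RE = {
    ("H", "n"): ("V", "w'"), ("H", "e"): ("H", "w'"),
    ("H", "s"): ("V", "e'"), ("H", "w"): ("H", "e'"),
    ("V", "n"): ("V", "s'"), ("V", "e"): ("H", "n'"),
    ("V", "s"): ("V", "n'"), ("V", "w"): ("H", "s'"),
}

# Orthogonal case (Figure 2.4(b)): a particle at port n with the bar
# horizontal turns right and rotates the bar to vertical.
assert delta_RE[("H", "n")] == ("V", "w'")

# Backward determinism: delta_RE is injective, so it can be inverted, and
# the unique predecessor of any (state, output) pair can be recovered.
inverse = {v: k for k, v in delta_RE.items()}
assert len(inverse) == len(delta_RE)
assert inverse[("V", "w'")] == ("H", "n")
print("RE move function is reversible")
```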


2.2.3. Constructing reversible machines by REs

By combining many RLEMs we can compose a reversible logic circuit. When connecting them, we assume the following: each output port of an RLEM can be connected to only one input port of some (possibly the same) RLEM. Thus, fan-out of an output is not allowed. The input (output, respectively) lines of the RLEMs that are not connected to other RLEMs are specified as the input (output) lines of the whole circuit. Hence, the circuit itself can be regarded as a kind of sequential machine.

It is known that any RSM can be simulated by a circuit composed only of REs [4, 12]. Note that a precise definition of the notion of simulation is given in Ref. [4]. We explain it by an example. Consider an RSM M0 = ({q1, q2, q3}, {a1, a2}, {b1, b2}, δ0), where δ0 is given in Table 2.2.

Table 2.2. The move function δ0 of an example RSM M0.

                         Input
  Present state    a1        a2
  q1               q2 b1     q3 b2
  q2               q2 b2     q1 b1
  q3               q1 b2     q3 b1

Figure 2.5. The RSM M0 implemented by REs [12]. Here, M0 is in the state q1, since the bottom RE of the leftmost column is in the state H.

The RSM M0 is simulated by the circuit shown in Figure 2.5. It has three columns of REs, each of which corresponds to a state of M0. If M0's state is qj, then the bottom RE of the jth column is set to the state H; all other REs are set to V. The REs of the ith row correspond to the input symbol ai as well as the output symbol bi. In Figure 2.5, the circuit is in the state q1. If a particle is given to the line, e.g., a2, then, after setting the bottom


RE of the first column to V, the particle appears on the line q1a2. This means that the crossing point of the second row and the first column has been found. Since δ0(q1, a2) = (q3, b2), this line is connected to the RE of the second row of the third column. By this, the bottom RE of the third column is set to H, and finally the particle appears on the output line b2.

Generalizing the above construction method, we can say the following: for any RSM we can construct a circuit out of REs that simulates the RSM. We define universality of an RLEM as follows.

Definition 3. An RLEM is called universal if any RSM can be simulated by a circuit composed only of copies of the RLEM.

Theorem 1 (see Ref. [12]). A rotary element (RE) is universal.

It is also possible to construct reversible Turing machines (RTMs) out of REs. An RTM is a deterministic TM that also has the backward deterministic property (see Refs. [3–5] for its definition). It is known that for any irreversible TM there is an RTM that simulates the former and leaves no garbage information when it halts [5]. Hence, RTMs are computationally universal. Since the finite-state control and a tape cell of an RTM can be formalized as RSMs, it is easy to construct them using only REs [2, 4, 13].

Here, we show a simple example. Figure 2.6 is a circuit that simulates an RTM Tparity that accepts the language {1^(2n) | n = 0, 1, ...}, whose move function is specified by the following set of quintuples:

{[q0, 0, 1, R, q1], [q1, 0, 1, L, qacc], [q1, 1, 0, R, q2], [q2, 0, 1, L, qrej], [q2, 1, 0, R, q1]}

For example, if an input string 0110 is given, the machine moves as follows:

q0 0110 ⊢ 1 q1 110 ⊢ 10 q2 10 ⊢ 100 q1 0 ⊢ 10 qacc 01.

In the left part of Figure 2.6 there is a circuit that simulates the finite-state control of Tparity. In the right part, infinitely many copies of a circuit that simulates a tape cell are placed. If we give a particle to the input port "Begin", the machine starts to compute. Finally, the particle comes out from the output port "Accept" or "Reject" depending on the input.
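Before looking at the circuit, it may help to run the quintuples directly. The sketch below is our code; it abstracts away the RE-level realization and simulates Tparity on inputs of the delimiter form used in the example above.

```python
# Quintuples of T_parity: (state, read) -> (write, move, next state)
DELTA = {
    ('q0', '0'): ('1', 'R', 'q1'),
    ('q1', '0'): ('1', 'L', 'qacc'),
    ('q1', '1'): ('0', 'R', 'q2'),
    ('q2', '0'): ('1', 'L', 'qrej'),
    ('q2', '1'): ('0', 'R', 'q1'),
}

def run_tparity(word):
    """Run T_parity on a tape like '0110' (the string 11 between delimiters)."""
    tape, head, state = list(word), 0, 'q0'
    while state not in ('qacc', 'qrej'):
        write, move, state = DELTA[(state, tape[head])]
        tape[head] = write
        head += 1 if move == 'R' else -1
    return state

assert run_tparity('0110') == 'qacc'   # 1^2 is accepted
assert run_tparity('010') == 'qrej'    # 1^1 is rejected
```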


Figure 2.6. RTM Tparity realized by a circuit made of REs. An example of its whole computing process is found in Ref. [13].

Detailed descriptions of this circuit, as well as how it works, are given in Refs. [4, 13]. In this way, any RTM can be realized rather simply by a circuit composed only of REs. If we tried to compose an RTM using only reversible logic gates (or even irreversible logic gates), the circuit would become complex. One of the reasons is that synchronization of signals at each gate is necessary.

2.3. All Non-degenerate 2-State RLEMs Except Four are Universal

There are infinitely many kinds of RLEMs besides the RE. Here, we consider the problem of which RLEMs are universal. Surprisingly, all non-degenerate 2-state RLEMs except only four are universal [14]. We explain how this is shown.

We first classify 2-state RLEMs. The total number of 2-state k-symbol RLEMs is (2k)!, and they are numbered from 0 to (2k)! − 1 in a certain lexicographic order [15]. To indicate that it is a k-symbol RLEM, the prefix "k-" is attached to its serial number, as in RLEM 4-289. Here, we use a pictorial representation of a 2-state RLEM. Consider, as an example, the 2-state 4-symbol RLEM 4-289 that is


Table 2.3. The move function δ4-289 of RLEM 4-289.

                         Input
  Present state    a      b      c      d
  State 0          0 w    0 x    1 w    1 x
  State 1          0 y    0 z    1 z    1 y

Figure 2.7. Pictorial representation of the 2-state RLEM 4-289, which is equivalent to RE.

defined by the RSM M4-289 = ({0, 1}, {a, b, c, d}, {w, x, y, z}, δ4-289), where the move function δ4-289 is given in Table 2.3. It is represented pictorially as in Figure 2.7, where the two boxes correspond to the states 0 and 1, and solid and dotted lines in a box describe the input–output relation in each state. A solid line indicates that the state changes, and a dotted line indicates that the state remains unchanged. For example, if RLEM 4-289 receives the input symbol c in the state 0, then it gives the output w and goes to the state 1. As in the case of the RE, we interpret each input/output symbol as representing an occurrence of a signal at the corresponding input/output port.

We regard two RLEMs as equivalent if one can be obtained from the other by renaming the states and/or the input/output symbols. We see that RLEM 4-289 is equivalent to the RE. It has been shown that the numbers of equivalence classes of 2-state 2-, 3-, and 4-symbol RLEMs are 8, 24, and 82, respectively [15]. Figure 2.8 shows all representative RLEMs in the equivalence classes of 2- and 3-symbol RLEMs. Each representative is chosen to have the smallest serial number in its class.
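This notion of equivalence can be verified mechanically. The brute-force sketch below (our code, with both move functions transcribed from Tables 2.1 and 2.3) searches for renamings of states and input/output symbols that turn RLEM 4-289 into the RE.

```python
from itertools import permutations

def equivalent(d1, d2):
    """True iff d1 becomes d2 under some renaming of states, inputs, outputs."""
    s1 = sorted({q for q, _ in d1});            s2 = sorted({q for q, _ in d2})
    i1 = sorted({a for _, a in d1});            i2 = sorted({a for _, a in d2})
    o1 = sorted({o for _, o in d1.values()});   o2 = sorted({o for _, o in d2.values()})
    for sp in permutations(s2):
        fs = dict(zip(s1, sp))                  # state renaming
        for ip in permutations(i2):
            fi = dict(zip(i1, ip))              # input renaming
            for op in permutations(o2):
                fo = dict(zip(o1, op))          # output renaming
                if all(d2[(fs[q], fi[a])] == (fs[d1[(q, a)][0]], fo[d1[(q, a)][1]])
                       for (q, a) in d1):
                    return True
    return False

DELTA_RE = {('H', 'n'): ('V', "w'"), ('H', 'e'): ('H', "w'"),
            ('H', 's'): ('V', "e'"), ('H', 'w'): ('H', "e'"),
            ('V', 'n'): ('V', "s'"), ('V', 'e'): ('H', "n'"),
            ('V', 's'): ('V', "n'"), ('V', 'w'): ('H', "s'")}
DELTA_4_289 = {('0', 'a'): ('0', 'w'), ('0', 'b'): ('0', 'x'),
               ('0', 'c'): ('1', 'w'), ('0', 'd'): ('1', 'x'),
               ('1', 'a'): ('0', 'y'), ('1', 'b'): ('0', 'z'),
               ('1', 'c'): ('1', 'z'), ('1', 'd'): ('1', 'y')}
assert equivalent(DELTA_4_289, DELTA_RE)
```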


Figure 2.8. Representatives of the 8 equivalence classes of the 24 2-symbol RLEMs (top), and of the 24 equivalence classes of the 720 3-symbol RLEMs (bottom) [15]. The indications "eq. to wires" and "eq. to 2-n" mean that the RLEM is equivalent to connecting wires, or to RLEM 2-n, respectively; such RLEMs are degenerate.

Among the k-symbol RLEMs there are degenerate ones, each of which is either equivalent to simple connecting wires (e.g., RLEM 3-3) or equivalent to a k′-symbol RLEM with k′ < k (e.g., RLEM 3-6). A precise definition is found in Refs. [4, 14]. In Figure 2.8, they are indicated by "eq. to wires" or "eq. to 2-n". Thus, the non-degenerate k-symbol RLEMs are the ones to be studied.


It is known that the numbers of non-degenerate 2-, 3- and 4-symbol RLEMs are 4, 14, and 55, respectively, and that the following three lemmas hold.

Lemma 1 (see Refs. [14, 16]). An RE can be composed of RLEM 3-10.

Lemma 2 (see Ref. [16]). RLEM 3-10 can be composed of RLEMs 2-3 and 2-4.

Lemma 3 (see Ref. [14]). RLEMs 2-3 and 2-4 can be composed of any one of the 14 non-degenerate 3-symbol RLEMs.

From these, we obtain the next lemma.

Lemma 4 (see Ref. [14]). An RE can be constructed from any one of the 14 non-degenerate 3-symbol RLEMs.

Lemmas 1–3 are proved by designing circuits, composed of the given RLEMs, that correctly simulate the target RLEMs. Lemma 1 is proved by a circuit made of RLEM 3-10 that simulates an RE, first given in Ref. [16]; later, the simpler circuit shown in Figure 2.9 was given in Ref. [14]. Lemma 2 is proved by the circuit made of RLEMs 2-3 and 2-4 that simulates RLEM 3-10, shown in Figure 2.10 [16]. Finally, Lemma 3 is proved by 28 circuits, composed of each of the 14 non-degenerate 3-symbol RLEMs, that simulate RLEMs 2-3 and 2-4, as in Figure 2.11 [14].

The following lemma relates k-symbol RLEMs and (k − 1)-symbol RLEMs.

Lemma 5 (see Ref. [14]). Let Mk be an arbitrary non-degenerate k-symbol RLEM (k > 2). Then, there exists a non-degenerate (k − 1)-symbol RLEM Mk−1 that can be simulated by Mk.

We explain only the key idea of Lemma 5. Given a k-symbol RLEM, we choose one output line and one input line and connect them to make a feedback loop. By this, we obtain a (k − 1)-symbol RLEM. Figure 2.12 shows the case of the 4-symbol RLEM 4-23617.


Figure 2.9. A circuit composed of RLEM 3-10 that simulates an RE [14]. (a) and (b) correspond to the states H and V of the RE, respectively.

Figure 2.10. A circuit composed of RLEMs 2-3 and 2-4 that simulates RLEM 3-10 [16]. The lower left and lower right figures correspond to the states 0 and 1 of RLEM 3-10.

If we add an appropriate feedback loop, we obtain a non-degenerate RLEM (upper row of Figure 2.12); if the feedback is inappropriate, the resulting RLEM is degenerate (lower row of Figure 2.12).
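A minimal sketch (ours) of the key idea of Lemma 5: feeding one output port back into one input port. The port names are illustrative; for an inappropriate choice the construction may yield a degenerate element, or may even trap the particle in the loop.

```python
def add_feedback(delta, out_port, in_port):
    """Connect output `out_port` back to input `in_port` of an RLEM.

    A particle emitted on `out_port` instantly re-enters on `in_port`,
    possibly several times, until some other output is produced.  The
    result is an element with one fewer input and one fewer output."""
    new_delta = {}
    for (q, a), (q1, o) in delta.items():
        if a == in_port:              # in_port is no longer an external input
            continue
        seen = set()
        while o == out_port:          # follow the particle around the loop
            if q1 in seen:
                raise ValueError('particle trapped in the feedback loop')
            seen.add(q1)
            q1, o = delta[(q1, in_port)]
        new_delta[(q, a)] = (q1, o)
    return new_delta
```

Applying this to a 4-symbol move function yields a 3-symbol one; Lemma 5 asserts that, for every non-degenerate RLEM, some choice of ports yields a non-degenerate result.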


Figure 2.11. Circuits composed of each of the 14 non-degenerate 3-symbol RLEMs that simulate RLEMs 2-3 and 2-4 [14].


Figure 2.12. Making a 3-symbol RLEM by adding a feedback loop to the 4-symbol RLEM 4-23617 [14]. If the feedback is appropriate, we obtain a non-degenerate RLEM, equivalent to 3-451 (upper row); if not, a degenerate one, equivalent to 3-450 (lower row).

Figure 2.13. The four non-degenerate 2-state 2-symbol RLEMs: 2-2, 2-3, 2-4 and 2-17.

Generally, it is proved that, for a given non-degenerate k-symbol RLEM (k > 2), we can always find a feedback loop by which a non-degenerate (k − 1)-symbol RLEM is obtained [14].

By Theorem 1 and Lemmas 4 and 5, we have the next theorem, stating that almost all non-degenerate 2-state RLEMs are universal. Note that universal RLEMs can simulate each other.

Theorem 2 (see Ref. [14]). Every non-degenerate 2-state k-symbol RLEM is universal if k > 2.

There are four non-degenerate 2-state 2-symbol RLEMs (Figure 2.13). So far, three of them have been shown to be non-universal.

Lemma 6 (see Ref. [17]). RLEM 2-2 can simulate neither RLEM 2-3, 2-4 nor 2-17.


Figure 2.14. A hierarchy among 2-state RLEMs [17]. Here, A → B (A ↛ B, respectively) indicates that A can (cannot) simulate B; every 2-state k-symbol RLEM with k > 2 is universal, while RLEMs 2-2, 2-3 and 2-4 are non-universal.

Lemma 7 (see Ref. [17]). RLEM 2-3 can simulate neither RLEM 2-4 nor 2-17, and RLEM 2-4 can simulate neither RLEM 2-3 nor 2-17.

Theorem 3 (see Ref. [17]). RLEMs 2-2, 2-3 and 2-4 are non-universal.

The following lemma says that RLEM 2-2 is the weakest among the non-degenerate 2-state RLEMs.

Lemma 8 (see Ref. [17]). RLEM 2-2 can be simulated by any one of RLEMs 2-3, 2-4 and 2-17.

Figure 2.14 summarizes the above results. It is not known whether RLEM 2-17 is universal. On the other hand, it is shown that any combination of two RLEMs among 2-3, 2-4 and 2-17 is universal [16, 17].

2.4. Realizing RLEMs in Reversible Environments

We now discuss how RLEMs can be realized in reversible environments. With present technology, it is difficult to implement RLEMs in a practical physical system having reversibility at the nano-scale.


However, thought experiments suggest possibilities for realizing them in the future. Here, we consider two models of reversible environments: the billiard ball model, which is a reversible physical model of computing, and a reversible cellular automaton with an extremely simple local transition function. In both models, computation is performed by collisions of moving objects. This kind of computing paradigm is sometimes called collision-based computing [18]. The following implementations, in particular, use such collisions to realize RLEMs directly; they therefore bear no direct relation to gate operations such as AND, OR, and NOT.

2.4.1. RLEMs in the billiard ball model

The billiard ball model (BBM) is an idealized mechanical model proposed by Fredkin and Toffoli [6]. It performs computation by elastic collisions of balls and reflectors. They showed that the Fredkin gate, a universal reversible logic gate, is realizable in the BBM. It is known that an RE can be simulated by a circuit composed of 12 Fredkin gates [4]. Hence, an RE is in principle realizable in the BBM. But composing an RE from Fredkin gates is not a good method. First, it needs 12 copies of the BBM configuration of a Fredkin gate, so the whole BBM configuration becomes complex. Furthermore, as long as we use logic gates, we suffer from the signal synchronization problem: the timing of two or more signals must be exactly adjusted at each gate.

There is a better method that directly realizes an RE in the BBM. Figure 2.15 shows the BBM configuration that simulates an RE [3]. It consists of one stationary ball, called a state-ball, and many reflectors, indicated by small rectangles. The state-ball is placed at the position H or V in Figure 2.15, depending on the state of the simulated RE. A moving ball, called a signal-ball, can be given to any one of the input lines n, e, s, and w. The operation of an RE is then correctly simulated by collisions of the balls with each other and with the reflectors.


Figure 2.15. A BBM configuration that simulates an RE [3].

Figure 2.16. The process of simulating δRE(V, s) = (V, n′) of the RE in the BBM.

Figure 2.16 shows the case where the RE is in the state V and a signal-ball is given to s, which corresponds to Figure 2.4(a). This is a trivial case: the signal-ball simply goes straight ahead, without interacting with the state-ball or the reflectors, and finally comes out from the output port n′. Figure 2.17 shows the case where the RE is in the state H and a signal-ball is given to s, which corresponds to Figure 2.4(b).


Figure 2.17. The process of simulating δRE(H, s) = (V, e′) of the RE in the BBM.

Here the signal-ball collides with the state-ball, and the two travel along the trajectories s0 and s1 in Figure 2.15, respectively. Then the balls collide again. By the second collision, the state-ball stops at the position V, while the signal-ball continues to travel rightward and finally goes out from the port e′.

In the realization of a Fredkin gate in the BBM [6], all the balls must have the same velocity. Furthermore, they must arrive at the gate at exactly the same time.


In the direct realization of an RE (Figure 2.15), by contrast, a signal-ball may arrive at the RE at any moment and with any speed, since the state-ball is stationary in the idle state. Hence, there is no need to synchronize many balls. This is an advantage of using an RLEM rather than a reversible logic gate. The method has been extended to show that any m-state k-symbol RLEM with k ≤ 4 can be realized in the BBM in a systematic way [19].

2.4.2. RLEM in a simple reversible cellular automaton

A reversible cellular automaton (RCA) is an abstract spatiotemporal model of a reversible physical space. An RCA is a CA whose global transition function, induced by the local transition function, is injective. Although the local transition function may not be directly related to a reversible physical law, by studying simple RCA models we can learn which kinds of primitive reversible operations are required for universal computing. This gives insight into realizing reversible computers in a physically reversible environment.

Computational universality of a 2D RCA was first shown by Toffoli [7]. Later, in Ref. [20], the framework of a partitioned cellular automaton (PCA) was proposed, and universality of a 1D RCA was proved using it. A PCA is a subclass of a standard CA and is useful for designing an RCA, since injectivity of its local transition function is equivalent to injectivity of the global transition function.

Here, we use a 2D triangular partitioned cellular automaton (TPCA). Its cellular space is shown in Figure 2.18(a). Each equilateral triangle in the space is called a cell, which is further divided into three parts (three isosceles triangles). Each part has a finite set of states; hence, the state set of a cell is the Cartesian product of the state sets of the three parts. The state transition of a cell is determined by a set of local transition rules of the form given in Figure 2.18(b).


Figure 2.18. Triangular partitioned cellular automaton (TPCA). (a) Its cellular space, and (b) a local transition rule.

Figure 2.19. Local function of the reversible ETPCA 0347 defined by four local transition rules.

That is, the next state of a cell is determined by the present states of the three parts adjacent to the cell. A TPCA is called isotropic (or rotation-symmetric) if its set of local transition rules satisfies the following condition: for each local transition rule in the set, any rule obtained by rotating both sides of it by a multiple of 60 degrees is also contained in the set. A TPCA is called an elementary triangular partitioned cellular automaton (ETPCA) [4, 21] if it is isotropic and each part of a cell has the state set {0, 1}. Hereafter, the states 0 and 1 are represented by a blank and a particle (i.e., •), respectively.

We consider the particular ETPCA No. 0347, whose local transition function is shown in Figure 2.19. Note that in an ETPCA the local transition function is determined by only four local transition rules, and hence it is extremely simple. The number 0347 is obtained by reading the particle patterns on the right-hand sides of the four local transition rules as binary numbers. We can see that ETPCA 0347 is reversible, since its local function is injective. It is known that there are 256 ETPCAs in total, of which 36 are reversible [4].
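The reversibility claim can be checked mechanically. The following sketch is ours and deliberately simplified: it ignores the triangular geometry and assumes that the kth digit of the identification number encodes, as a 3-bit binary pattern, the right-hand side for the left-hand-side class with k particles; under isotropy, reversibility then amounts to injectivity of the induced map on all eight rotated patterns.

```python
def rotations(t):
    """The three cyclic rotations of a 3-part pattern."""
    return [t, (t[2], t[0], t[1]), (t[1], t[2], t[0])]

def local_map(rule):
    """Expand an ETPCA number such as '0347' into a map on all 8 patterns."""
    digits = [int(c) for c in rule]
    canonical = {0: (0, 0, 0), 1: (1, 0, 0), 2: (1, 1, 0), 3: (1, 1, 1)}
    mapping = {}
    for p in [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]:
        k = sum(p)                              # number of particles
        d = digits[k]
        rhs = ((d >> 2) & 1, (d >> 1) & 1, d & 1)
        # rotate the canonical left-hand side onto p; by isotropy, the
        # right-hand side is rotated the same way
        i = rotations(canonical[k]).index(p)
        mapping[p] = rotations(rhs)[i]
    return mapping

def is_reversible(rule):
    m = local_map(rule)
    return len(set(m.values())) == len(m)

assert is_reversible('0347')    # the local function of ETPCA 0347 is injective
```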


Figure 2.20. A space-moving pattern called a glider in the reversible ETPCA 0347.

Figure 2.21. RLEM 4-31.

There is a space-moving pattern called a glider (Figure 2.20) in the reversible cellular space of ETPCA 0347, as well as several other useful primitive patterns. By making these patterns interact, we can observe various interesting phenomena in ETPCA 0347 [21].

It is known that RLEM 4-31 (Figure 2.21) can be directly embedded in the cellular space of the reversible ETPCA 0347 [22], where "directly" means "without using reversible logic gates". Here, we illustrate its outline. Combining primitive patterns, we can construct a pattern that simulates RLEM 4-31, as shown in Figure 2.22. Around the center of this pattern there are two small circles in which a pattern called a fin can be placed. If the fin is in the lower (upper, respectively) circle, the simulated RLEM is in the state 0 (1). In Figure 2.22 the fin is in the lower circle. If we give a glider from the input port d, it moves along the lines in Figure 2.22. By this, the fin is shifted to the upper position, and finally the glider goes out from the output port w. In this way, the operation of RLEM 4-31 is correctly simulated.


Figure 2.22. RLEM 4-31 realized in ETPCA 0347 [22].

The reason we use RLEM 4-31 (Figure 2.21) rather than RLEM 4-289 (Figure 2.7), which is equivalent to a rotary element, is that in RLEM 4-31 the total number of transitions from one state to the other (i.e., the number of solid lines) is smaller than in RLEM 4-289. Hence, RLEM 4-31 is easier to implement in ETPCA 0347. In addition, an RTM can be constructed out of RLEM 4-31 as easily as out of RLEM 4-289. It has been shown that any RTM can be systematically composed of RLEM 4-31 [23]. Figure 2.23 shows the circuit made of RLEM 4-31 that simulates the RTM Tparity given in Section 2.2.3; the circuit has the same function as the one shown in Figure 2.6. By replacing each occurrence of RLEM 4-31 in Figure 2.23 by the pattern in Figure 2.22, we obtain the configuration shown in Figure 2.24, which simulates the RTM Tparity in the reversible cellular space of ETPCA 0347.

We made an emulator of ETPCA 0347 that works on Golly [25], a high-speed CA simulator. In particular, how the pattern in Figure 2.22 works, and how the computation is carried out by the configuration in Figure 2.24, can be seen using the emulator file given in Ref. [24].

Figure 2.23. RTM Tparity realized by a circuit composed of RLEM 4-31 [23].


Figure 2.24. Configuration of RTM Tparity realized in ETPCA 0347, simulated on Golly [24].

2.5. Concluding Remarks

In this chapter, we examined the properties of reversible logic elements with memory (RLEMs) as alternative logical devices for reversible computing, and we observed that RLEMs have several advantages over traditional logic gates. In Section 2.2, we saw that any reversible Turing machine can be realized as a circuit consisting only of copies of a rotary element (RE), with a remarkably simple structure. Such an architecture is possible mainly because there is no need to synchronize two or more signals in a circuit made of RLEMs. In Section 2.3, we saw that all non-degenerate 2-state RLEMs except only four are universal. This expands the possibility of implementing a universal RLEM in a reversible physical system. In Section 2.4, we showed how RLEMs can be realized in two models of reversible environments: the billiard ball model and the reversible cellular automaton ETPCA 0347. Although these models are idealized or artificial, they give insights into realizing an RLEM using fundamental reversible physical phenomena. In particular, ETPCA 0347 shows that even from an extremely simple set of reversible local transition rules, an RLEM and reversible Turing machines can be realized as configurations of reasonable size.

Although an RLEM is only one candidate for a conceptual device, and there may be other possible candidates, we saw that it has several advantages over a reversible logic gate for reversible computing. For many other future computing paradigms, such an investigation will likewise help to find good pathways (Figure 2.1) for their realization.


References

1. J.-M. Bocheński, Ancient Formal Logic (North-Holland, Amsterdam, 1951).
2. K. Morita, A simple reversible logic element and cellular automata for reversible computing. In M. Margenstern and Y. Rogozhin (eds.), Proc. MCU 2001, LNCS 2055, pp. 102–113 (2001). doi: 10.1007/3-540-45132-3_6.
3. K. Morita, Reversible computing and cellular automata — A survey. Theoret. Comput. Sci. 395, 101–131 (2008). doi: 10.1016/j.tcs.2008.01.041.
4. K. Morita, Theory of Reversible Computing (Springer, Tokyo, 2017). doi: 10.1007/978-4-431-56606-9.
5. C.-H. Bennett, Logical reversibility of computation. IBM J. Res. Dev. 17, 525–532 (1973). doi: 10.1147/rd.176.0525.
6. E. Fredkin and T. Toffoli, Conservative logic. Int. J. Theoret. Phys. 21, 219–253 (1982). doi: 10.1007/BF01857727.
7. T. Toffoli, Computation and construction universality of reversible cellular automata. J. Comput. Syst. Sci. 15, 213–231 (1977). doi: 10.1016/S0022-0000(77)80007-X.
8. T. Toffoli, Reversible computing. In J. W. de Bakker and J. van Leeuwen (eds.), Automata, Languages and Programming, LNCS 85, pp. 632–644 (1980). doi: 10.1007/3-540-10003-2_104.
9. H.-K. Büning and L. Priese, Universal asynchronous iterative arrays of Mealy automata. Acta Informatica 13, 269–285 (1980). doi: 10.1007/BF00288646.
10. R.-M. Keller, Towards a theory of universal speed-independent modules. IEEE Trans. Computers C-23, 21–33 (1974). doi: 10.1109/T-C.1974.223773.
11. T. Toffoli, Bicontinuous extensions of invertible combinatorial functions. Math. Syst. Theory 14, 12–23 (1981). doi: 10.1007/BF01752388.
12. K. Morita, A new universal logic element for reversible computing. In C. Martin-Vide and V. Mitrana (eds.), Grammars and Automata for String Processing (Taylor & Francis, London, 2003), pp. 285–294. doi: 10.1201/9780203009642.ch28.
13. K. Morita, Constructing a reversible Turing machine by a rotary element, a reversible logic element with memory. Hiroshima University Institutional Repository, http://ir.lib.hiroshima-u.ac.jp/00029224 (2010).
14. K. Morita, T. Ogiro, A. Alhazov, and T. Tanizawa, Non-degenerate 2-state reversible logic elements with three or more symbols are all universal. J. Multiple-Valued Logic Soft Comput. 18, 37–54 (2012).
15. K. Morita, T. Ogiro, K. Tanaka, and H. Kato, Classification and universality of reversible logic elements with one-bit memory. In M. Margenstern (ed.), Proc. MCU 2004, LNCS 3354, pp. 245–256 (2005). doi: 10.1007/978-3-540-31834-7_20.
16. J. Lee, F. Peper, S. Adachi, and K. Morita, An asynchronous cellular automaton implementing 2-state 2-input 2-output reversed-twin reversible elements. In H. Umeo et al. (eds.), Proc. ACRI 2008, LNCS 5191, pp. 67–76 (2008). doi: 10.1007/978-3-540-79992-4_9.


17. Y. Mukai, T. Ogiro, and K. Morita, Universality problems on reversible logic elements with 1-bit memory. Int. J. Unconvent. Comput. 10, 353–373 (2014).
18. A. Adamatzky (ed.), Collision-Based Computing (Springer, 2002). doi: 10.1007/978-1-4471-0129-1.
19. Y. Mukai and K. Morita, Realizing reversible logic elements with memory in the billiard ball model. Int. J. Unconvent. Comput. 8, 47–59 (2012).
20. K. Morita and M. Harao, Computation universality of one-dimensional reversible (injective) cellular automata. Trans. IEICE E72, 758–762 (1989). http://ir.lib.hiroshima-u.ac.jp/00048449.
21. K. Morita, A universal non-conservative reversible elementary triangular partitioned cellular automaton that shows complex behavior. Nat. Comput. 18(3), 413–428 (2019). doi: 10.1007/s11047-017-9655-9.
22. K. Morita, Finding a pathway from reversible microscopic laws to reversible computers. Int. J. Unconvent. Comput. 13, 203–213 (2017).
23. K. Morita and R. Suyama, Compact realization of reversible Turing machines by 2-state reversible logic elements. In O. H. Ibarra, L. Kari, and S. Kopecki (eds.), Proc. UCNC 2014, LNCS 8553, pp. 280–292 (2014). doi: 10.1007/978-3-319-08123-6_23. Slides with figures of computer simulation: Hiroshima University Institutional Repository, http://ir.lib.hiroshima-u.ac.jp/00036076.
24. K. Morita, Reversible world: Data set for simulating a reversible elementary triangular partitioned cellular automaton on Golly. Hiroshima University Institutional Repository, http://ir.lib.hiroshima-u.ac.jp/00042655 (2017).
25. A. Trevorrow, T. Rokicki, T. Hutton et al., Golly: An open source, cross-platform application for exploring Conway's Game of Life and other cellular automata, http://golly.sourceforge.net (2005).



© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0003

Chapter 3

Space Bounded Scatter Machines

João Alves Alírio∗, José Félix Costa† and Luís Filipe Fonseca‡

Department of Mathematics, Instituto Superior Técnico, and
CFCUL — Centro de Filosofia das Ciências da Universidade de Lisboa,
Av. Rovisco Pais 1, Lisbon 1049-001, Portugal
∗joao [email protected]
[email protected]
[email protected]

We study deterministic Turing machines with access to oracles that give imprecise answers and take time (depending on the size of the queries) to consult, in order to model situations where computers process sensor data from their environment. The imprecision of the oracle gives rise to probabilistic behaviour of the Turing machine. Turing machines clocked in polynomial time have previously been classified according to the degree of precision provided by the analogue oracle. In this chapter we repeat that study, but now addressing resources bounded in space: we present a characterization of Turing machines bounded in polynomial space having access to stochastic timed oracles, in computational classes depending on the precision of the oracle or, equivalently, on the analogue-digital protocol.

3.1. Introduction

If one wonders why the real numbers come into the natural sciences, the most common answer is that reality is easier to model and forecast in the continuum, mainly due to the success and development of Calculus. Thus, when, by the end of the 1930s, a model of Vannevar Bush's analogue computer was developed by Claude Shannon [1], it resulted in a system of differential equations of a particular kind, describing a network of mechanical integrators,


where input and output are real-valued variables. Initial conditions or, in the more general setting of more than one dimension, boundary conditions are given as inputs. Real numbers may encode non-computable information in different degrees (see Ref. [32]), but the way they are used in Bush's analogue computer does not permit one to decipher their potential information content and possibly decide the undecidable.

Moving from continuous-time to discrete-time analogue computation, we find the Analogue Recurrent Neural Net (ARNN) model (see Ref. [2]), a well-known discrete-time computational system that computes beyond the Turing model. This feature is common to dynamic systems that are universal and able to extract every digit of the expansion of an internal real-valued parameter. These dynamic systems behave like a technician improving his measurements (using better and better equipment): they can perform a measurement of O(n) bits of the binary expansion of a parameter in linear time and use these sequences of bits as advice to decide upon inputs of size n. By the end of the nineties, the ARNN had become a model of what a discrete-time dynamic system with real parameters can compute in a polynomial number of steps in the size of the input. (In a way, the fact that the weights of the network are real numbers is not that conspicuous since, as "physical" models, neural networks have been treated, since the seventies, as models of cognition involving real weights (see Ref. [3]), either in learning activities (supervised or unsupervised) or in classification tasks.)

However, the persistence of real numbers in a computational model can be seen as an a priori embedding of the information one wants the system to extract later (see Refs. [4, 5]). Nevertheless, the ARNN model exhibits a very interesting structural property: as the type of the weights varies from the integer numbers Z, to the rational numbers Q, to the real numbers R, the computational power of the ARNN increases from the class of regular languages, to the class of recursive languages, to the class of all languages (being P/poly in polynomial time).

A real number can be seen as an oracle or advice to a Turing machine. Starting from a design we made in 2006 to answer Martin Davis's suggestion in Refs. [4, 5], we considered in Refs. [6–8] the


experimenter (e.g., the experimental physicist) as a Turing machine and the experiment of measurement (using a specified physical apparatus) as an oracle to the Turing machine (see Ref. [33]). The algorithm running on the machine abstracts the experimental method of measurement (encoding the recursive structure of experimental actions) chosen by the experimenter.

In Refs. [8, 9], we uncovered three types of experiments for finding approximations to real numbers in Physics. Some values can be determined by successive approximations, approaching the unknown value by dyadic rationals above and below that value (see Ref. [10] for a universal measurement algorithm relative to two-sided experiments); fundamental measurements of distance, angle, mass, etc., fall into this class. A second type of experiment was considered, e.g., in the measurement of the threshold of a neuron in Ref. [11]: we can approach the desired value only from below the threshold (one-sided experiments). A third type of measurement was discussed very recently in Ref. [12].

Turing machines having access to measurements can compute above the Turing limit. However, even under the controversial supposition that real numbers exist in the real world, no one knows how to engineer them into a dynamic system. Natural or artificial systems involving real-valued magnitudes may not be fully simulable, that is, they may execute non-algorithmic computations. (In Ref. [13], the author asks how we can have computation without an "algorithm", stating that "one might compare this feature to the theory of evolution based on natural selection that is a process-level theory for which the existence of some a priori algorithm is problematic".)

We also realized that the Theory of Measurement (see Refs. [14–16]) did not take into account the physical time needed for a measurement of increasing precision (as a function of the precision). The time complexity of a measurement reduces the computational power of dynamic systems with self-advice from their internal parameters. According to Beggs et al. [17], this reduction of super-Turing capabilities can be so great that the real numbers add no further power, even assuming that the reals exist beyond the discrete nature of matter and energy.
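The first type of experiment can be pictured as bisection. The sketch below is ours and deliberately idealized (exact comparisons, no error, no physical time cost): it extracts n bits of an unknown y ∈ ]0, 1[ using two-sided comparisons. The point made in the text is precisely that a real apparatus adds error and a per-comparison time cost that this idealization hides.

```python
def measure_bits(less_than_y, n):
    """Extract the first n binary digits of an unknown y in ]0, 1[,
    given an idealized two-sided experiment less_than_y(z) == (z < y)."""
    lo, hi, bits = 0.0, 1.0, []
    for _ in range(n):
        mid = (lo + hi) / 2
        if less_than_y(mid):      # y lies in the upper half
            bits.append(1)
            lo = mid
        else:                     # y lies in the lower half
            bits.append(0)
            hi = mid
    return bits

y = 1 / 3                         # hidden parameter, binary 0.010101...
print(measure_bits(lambda z: z < y, 7))   # -> [0, 1, 0, 1, 0, 1, 0]
```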


In the best scenario, we are still waiting for some evidence refuting the following conjecture: no reasonable physical measurement has an associated measurement map performable in polynomial time in the number of digits of precision.

The ARNN departs from being a realistic physical model in that its dynamics exhibits discontinuous derivatives, for example, not in agreement with conventional neural nets (a criticism not accepted in Ref. [31]). With a more realistic (analytic) activation function of the neurons, the time to read the next bit of a real weight is exponential in the number of bits already extracted. Measurements should therefore be regarded as information with possible error that takes time to consult. The complexity classes involved in such computations, bounded to a polynomial number of steps, were fully characterized in Refs. [9, 11, 12]. In Ref. [18], we synthesized our findings: in the best scenario, the power of the system drops from common computations with access to polynomially long advice to common computations helped by only sublogarithmically long advice.

In Ref. [19], we asked: (a) what information content can we read from physical experiments, for example, of measurement? and (b) by connecting an abstract measurement apparatus (the analogue component) with an abstract device such as a Turing machine (the digital component), what computational classes can we define? The model we developed includes the ARNN model as well as other models of discrete-time computation involving real numbers. The complexity classes we have analyzed thus far (see Refs. [6, 19–21]), relative to polynomial time bounds, are summarized in Table 3.1.

This chapter introduces for the first time the computational power of the same abstract computation models when bounded in polynomial space. Moreover, we also introduce a new communication protocol between the analogue and the digital components of the models.

3.2. First Concepts and Promised Results

We give special attention to deterministic Turing machines that have access to a stochastic oracle. This oracle is a physical experiment


of measurement which has an inherent error; this error is responsible for the stochastic behavior of our machine. The Turing machine itself behaves as if it were the experimenter.

A bounded-error probabilistic Turing machine M is said to decide a set A if there is a rational number ε with 0 < ε < 1/2 such that, if w ∈ A, then M rejects w with probability at most ε, and if w ∉ A, then M accepts w with probability at most ε. Usually a standard machine performs a single transition when consulting an oracle but, since we resort to timed measurements (where time is intrinsic to the measurement), our machine takes a number of steps instead.

As discussed in Section 3.1, in the last decades Turing machines bounded in polynomial time and performing measurements were thoroughly studied (e.g., see Refs. [6–12, 17, 18, 20–26]). The next natural step is to try to understand what can be computed by restricting the resources of the machine to polynomial space. Although we thought that space boundedness would not bring any interesting results to our theory, the fact is that it was a gap in our project started in 2007.

We now consider extended deterministic Turing machines with three components: the analogue component, or measurement; a deterministic Turing machine; and an analogue-digital protocol between the measurement and the Turing machine. One concrete and easy-to-understand version of the analogue-digital machine is the scatter machine experiment, which aims to measure the vertex position of a wedge that can be either sharp or smooth. The scatter experiment with the sharp vertex was introduced by Beggs and Tucker [27], and the experiment with the smooth vertex was introduced by Edwin Beggs, José Félix Costa and John Tucker in Ref. [24]. The Turing machine equipped with an experiment as oracle was introduced for the first time in Refs. [6, 20]. We are going to consider computational classes such as BPPSPACE, characterized by bounded-error probabilistic Turing machines bounded in polynomial space.

An advice function f : N → Σ∗ is a total function which assigns a word to each (input size) n ∈ N. A prefix advice function is an advice function f such that, for any n, m ∈ N, if n < m, then f(n)


is a prefix of f(m). If C is a class of sets and F a class of advice functions, we denote by C/F (respectively, C/F⋆) the class of sets A for which there exist a set B ∈ C and an advice function (respectively, prefix advice function) f ∈ F such that, for every w ∈ Σ∗, w ∈ A if and only if ⟨w, f(|w|)⟩ ∈ B. (Note that these definitions imply that C/F⋆ ⊆ C/F, since the prefix advice function used in C/F⋆ can be used as the advice function in C/F.) In particular, for sparse oracles we get the classes PSPACE/poly and BPPSPACE/poly, already included in Ref. [28].

This definition forces us to choose first the machine and only then the advice function, so we provide a different definition: if F is a class of advice functions, we denote by BPP//F (respectively, BPPSPACE//F) the class of sets A for which there exist a probabilistic advice Turing machine M clocked in polynomial time (respectively, bounded in polynomial space), a constant ε with 0 < ε < 1/2, and an advice function f ∈ F such that, for every w ∈ Σ∗, (a) if w ∈ A, then M rejects ⟨w, f(|w|)⟩ with probability at most ε, and (b) if w ∉ A, then M accepts ⟨w, f(|w|)⟩ with probability at most ε. This gives us two new classes: if F is a class of advice functions, we denote by BPP//F⋆ (respectively, BPPSPACE//F⋆) the class of sets A for which there exist a probabilistic advice Turing machine M clocked in polynomial time (respectively, bounded in polynomial space), a constant ε with 0 < ε < 1/2, and a prefix advice function f ∈ F such that, for every n ∈ N and every w ∈ Σ∗ with |w| ≤ n, (a) if w ∈ A, then M rejects ⟨w, f(n)⟩ with probability at most ε, and (b) if w ∉ A, then M accepts ⟨w, f(n)⟩ with probability at most ε.

The ultimate results regarding time-bounded Turing machines coupled with a variety of measurement experiments (see Refs. [8, 12]) are given in Table 3.1. These results were obtained by applying three methods of comparison. Suppose that y ∈ R is unknown and that a is a generated dyadic value (a number is dyadic if it has a finite binary representation). The two-sided comparison, also referred to as sign comparison (see Ref. [8]), can evaluate both a < y and y < a; the threshold comparison can only evaluate a < y; and the vanishing comparison evaluates the logical value of "a < y or y < a", in other words, it can only tell us whether the two values are the same or not.


Table 3.1. Complexity classes of polynomial-time Turing machines with different types of experiments, as described in the text, considered with different concepts of precision and time tolerance.

  Type of Oracle                               Infinite    Unbounded     Fixed
  Two-sided         lower bound                P/log       BPP//log⋆     BPP//log⋆
                    upper bound                P/poly      P/poly        P/poly
                    (w/exponential T)          P/log       BPP//log⋆     BPP//log⋆
  Threshold         lower bound                P/log       BPP//log⋆     BPP//log⋆
                    upper bound                ---         ---           ---
                    (w/exponential T)          P/log       BPP//log⋆     BPP//log⋆
  Vanishing Type 1  lower bound                P/poly      P/poly        BPP//log⋆
  (Parallel)        upper bound                P/poly      P/poly        BPP//log⋆
                    (w/exponential T)          ---         ---           ---
  Vanishing Type 2  lower bound                P/log       BPP//log⋆     BPP//log⋆
  (Clock)           upper bound                P/poly      P/poly        BPP//log⋆
                    (w/exponential T)          ---         BPP//log⋆     ---

Notes: Two-sided experiments have been considered in Refs. [6, 7, 10, 22–24] and threshold experiments in Ref. [26]. In the table, "(w/exponential T)" stands for a time of consultation exponential in the size of the query.

The main results of this chapter are lower and upper bounds for the sets decidable with infinite, arbitrary and fixed precision when using the Sharp Scatter Machine (ShSM) and the Smooth Scatter Machine (SmSM), with the standard protocol (see Refs. [6, 7]) and with a new kind of protocol. We prove the two main theorems:

Theorem 1. A set A is decidable by a ShSM or a SmSM bounded in polynomial space, equipped with the standard protocol of infinite precision, if and only if A ∈ PSPACE/poly. A set A is decidable by a ShSM or a SmSM bounded in polynomial space, equipped with the standard protocol of arbitrary or fixed precision, if and only if A ∈ BPPSPACE//poly.

Theorem 2. All sets are decidable by a scatter machine bounded in polynomial space without time schedule, equipped with the generalized protocol of infinite precision. A set A is decidable by a ShSM or a SmSM bounded in polynomial space, equipped with the generalized


protocol of infinite precision and with time schedule, if and only if A ∈ PSPACE/poly.

In Section 3.3, we give an introduction to the scatter experiment, the protocols and the scatter machine. In Section 3.4, we introduce probabilistic trees, sparse oracles, and the coding (and decoding) of the vertex position, and we study the bounds for the standard protocol. In Section 3.5, we discuss the new protocol.

3.3. The Experiment, the Protocols and the Machine

We intend to use the scatter experiment as an oracle to a Turing machine: whenever the Turing machine needs to consult the oracle, it calls for a value with a certain precision and, after a certain amount of time, it receives an answer from the analogue component. This precision is what gives the Turing machine its stochastic behavior. The oracle aims to measure the unknown position of a vertex in a wedge, with the help of a cannon and two collecting boxes, one on each side of the wedge. We set the cannon position according to the word on the query tape, and the cannon shoots particles towards the wedge. If a particle is collected by the left box, we know that it hit the wedge to the left of the vertex; otherwise, we know that it hit the wedge to the right of the vertex. With this information, the machine can continue its computation and write new words on the query tape, in the hope of getting a better approximation of the vertex position (see Refs. [6, 8, 9, 24]).

We consider two different versions of the experiment, shown in Figures 3.1 and 3.2: the sharp and the smooth version, with a sharp and a smooth wedge, respectively (see Refs. [6, 24]). In both versions, y is the unknown position of the vertex and z the position of the cannon. The sharp wedge makes a 45° angle with the cannon line and with both collecting boxes, so the particle collides with a box relatively fast. In fact, this time is constant unless z = y. In that case, y would have to be a dyadic number, because z is always dyadic: z is the binary word on the query tape and is therefore finite. Note that the probability

Figure 3.1. Sharp scatter machine.

of having z = y is zero, since there are infinitely many dyadic numbers within any subinterval of ]0, 1[. It is important to note that, since the collecting time is constant, we can modify the values of v, d or d′ so that this time equals that of one machine transition (we assume all transitions of the Turing machine to take the same amount of time, defined simply as a unit of time). However, in the smooth-wedge version the collecting time is no longer constant: it increases as the value of z approaches y. In fact, according to Beggs et al. [24], it is exponential in |y − z|^(−1), motivating the use of a clock to measure the time of the experiment and to make sure it does not run forever. This component is called the time schedule and, since we are working with polynomial space, we allow it to tick at most an exponential number of times. Let us consider the function g : ]0, 1[ → R such that g(x) describes the shape of the wedge of a smooth scatter experiment (considering the bottom-up x-axis).


Figure 3.2. Smooth scatter machine.

If g(x) is n times continuously differentiable near the vertex position y, with non-zero nth derivative and all derivatives up to the (n − 1)th vanishing at x = y, then a shot performed at the position z takes a physical time t(z) such that

A/|y − z|^(n−1) ≤ t(z) ≤ B/|y − z|^(n−1),

for some real constants A, B > 0, when |y − z| is sufficiently small. We assume, without loss of generality, that A = B = 1 and n = 2, so that t(z) = |y − z|^(−1) (see Ref. [24]). In addition, if T is the required time schedule for the SmSM, then for every z ∈ {0, 1}∗ we define the boundary numbers l_{|z|} and r_{|z|} as the two real numbers in ]0, 1[, if they exist, that satisfy the equation t(l_{|z|}) = t(r_{|z|}) = T(|z|), with l_{|z|} < y < r_{|z|}. Intuitively, the boundary numbers define the largest interval such that, if a particle is shot into it, no answer is given before the time runs out.
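A tiny sketch (ours) of this timing behaviour, under the normalization t(z) = 1/|y − z| above: the oracle answers only if the flight time beats the time schedule, and the boundary numbers are y ∓ 1/T.

```python
def smooth_shot(z, y, T):
    """Answer of a smooth scatter experiment with t(z) = 1/|y - z|,
    under a time schedule of T ticks."""
    if z == y or 1.0 / abs(y - z) > T:
        return 'timeout'          # particle still climbing the wedge
    return 'left' if z < y else 'right'

def boundary_numbers(y, T):
    """l and r with t(l) = t(r) = T: the largest interval of timeouts."""
    return y - 1.0 / T, y + 1.0 / T

y, T = 0.3333, 100
l, r = boundary_numbers(y, T)
assert smooth_shot((l + r) / 2, y, T) == 'timeout'
assert smooth_shot(l - 0.01, y, T) == 'left'
assert smooth_shot(r + 0.01, y, T) == 'right'
```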


The Turing machine interacts with the experiment via a communication protocol. The parameters are set according to three different types of assumptions on the precision: infinite, arbitrary, and fixed precision. With infinite precision, the experiment sets the parameters exactly as the Turing machine specifies them in the query tape; with arbitrary precision, the experiment can commit arbitrarily small errors when setting the parameter z, with limits specified by the Turing machine; lastly, with fixed precision, the experiment sets the parameter z with an error that is fixed a priori.

To interact with the experiment, the Turing machine has a query tape and some additional states: the "right" and "left" states and, in case we use the smooth version, the "timeout" state. The machine also has the "accepting" and "rejecting" states, which are used to decide whether the machine accepts or rejects the input. The computation tree can then be seen, in general, as a probabilistic tree where the nodes represent calls to the oracle. For the standard communication protocol, given a word q = q1q2 · · · qn in the query tape, if the machine performs a transition to the query state, then the experiment sets the cannon position to the dyadic rational z = 0.q1q2 · · · qn according to one of the following protocols:

Protocol 1 (Error-free protocol). The cannon position is set to the real number z (infinite precision).

Protocol 2 (Error-prone arbitrary precision protocol). The cannon position is set to a real number in the interval ]z − 2^(−|q|), z + 2^(−|q|)[ with uniform distribution.

Protocol 3 (Error-prone fixed precision protocol). The cannon position is set to a real number in the interval ]z − ξ, z + ξ[ with uniform distribution, where ξ is a fixed positive real number of the form ξ = 2^(−N) for some N ∈ N.

These were the standard communication protocols used in the analogue-digital machines bounded in time (see Refs. [6, 8, 9, 11, 12, 24]).
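A sketch (ours) of one oracle consultation of the sharp experiment under the three protocols; the word q on the query tape is read as the dyadic z = 0.q, and the answer omits the timeout case, which arises only for the smooth wedge.

```python
import random

def sharp_oracle(q, y, protocol='arbitrary', xi=2**-8):
    """One consultation of the sharp scatter experiment.

    q        -- binary word on the query tape, e.g. '1011'
    y        -- unknown vertex position in ]0, 1[
    protocol -- 'infinite', 'arbitrary' (error 2**-|q|) or 'fixed' (error xi)
    """
    z = int(q, 2) / 2**len(q)                  # cannon aimed at z = 0.q
    if protocol == 'infinite':
        shot = z
    elif protocol == 'arbitrary':
        shot = random.uniform(z - 2**-len(q), z + 2**-len(q))
    else:                                      # fixed precision
        shot = random.uniform(z - xi, z + xi)
    return 'left' if shot < y else 'right'

random.seed(0)
print(sharp_oracle('1011', y=0.7))             # z = 11/16; answer is stochastic
```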


machine's transition function cannot otherwise use the query tape. Similarly to logarithmic space reductions, where the Turing machine is allowed to write up to a polynomial number of bits on the output tape, the scatter machine can write an exponential number of bits on the query tape. As mentioned, with these protocols, either the standard or the generalized ones, we get a deterministic experiment for the infinite precision and two stochastic experiments corresponding to the arbitrary and fixed precisions. We then have a deterministic and a probabilistic scatter machine, with decision criteria according to the ones described earlier to define BPP and BPPSPACE, where we assume without loss of generality that a probabilistic scatter machine has a bounded error less than or equal to 1/4. The probabilistic uniform class BPP has the property that, given a set A ∈ BPP witnessed by a probabilistic Turing machine M with constant ε and a polynomial q, we can engineer a probabilistic Turing machine (based on M) that decides A with error probability at most 2^{−q(n)} for each input w with |w| = n. We note that this property also works in the current framework. As for the class BPP, we assume the same result for general probabilistic scatter machines, since the proof technique consists of running the Turing machine enough times, and we can do the same with scatter machines. To prove the lower bounds for the bounded error, we need to simulate a probabilistic advice Turing machine on a scatter machine (see Ref. [12]) and, in order to do so, we notice that a scatter experiment with an error-prone protocol^d can be used as a coin (see Ref. [6]). We now have the following proposition (see Refs. [7, 9, 11]).

Proposition 1 (see Ref. [6]). Given a biased coin with probability of heads p ∈ ]δ, 1 − δ[, for some δ with 0 < δ < 1/2, and a real number λ ∈ ]0, 1[, we can generate a sequence of independent fair coin tosses of length n with a number of biased coin tosses linear in n, up to probability λ.

^d That is, arbitrary or fixed precision.
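Proposition 1 can be illustrated with von Neumann's classical extraction trick (a sketch of ours, not necessarily the construction used in Ref. [6]): toss the biased coin in pairs and keep only the unequal pairs.

```python
import random

def biased_coin(p):
    """One toss of a coin with heads probability p (returns 1 for heads)."""
    return 1 if random.random() < p else 0

def fair_bits(n, p, max_tosses):
    """Extract up to n fair bits from a p-biased coin (von Neumann).

    Pairs HT and TH are equally likely (probability p(1-p) each), so an
    unequal pair yields one fair bit; equal pairs are discarded.  Since
    a pair succeeds with probability 2p(1-p) >= 2*delta*(1-delta), a
    number of tosses linear in n suffices except with small probability,
    matching the flavour of Proposition 1.
    """
    bits, tosses = [], 0
    while len(bits) < n and tosses + 2 <= max_tosses:
        a, b = biased_coin(p), biased_coin(p)
        tosses += 2
        if a != b:
            bits.append(a)
    return bits, tosses

# Example: 16 fair bits from a 0.3-biased coin within 400 tosses;
# a budget linear in n (constant depending on delta, lambda) suffices.
bits, used = fair_bits(16, 0.3, max_tosses=400)
```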


With respect to the upper bounds, we simulate the scatter machine directly on the oracle Turing machine. For each possible call to the oracle, we know in advance the probabilities of the possible answers of the oracle. In the sharp case, with a fixed precision ξ = 2^{−N}, the vertex position y_d is dyadic and lies inside the accuracy interval; the probabilities of the answer being left and right are (y_d − (z − ξ))/(2ξ) and (z + ξ − y_d)/(2ξ), respectively, and thus, to simulate the oracle call, we can simply simulate an event with probability (y_d − (z − 2^{−N}))/2^{−N+1}. The same expressions are valid when y_d is not dyadic; however, in this case the probabilities are not easy to compute. Analogously, for the smooth case, if we know that at least one of the boundary numbers is inside the accuracy interval, we can simulate an event with three possible outcomes, each one with a dyadic probability. In fact, a fair coin toss can simulate any event with dyadic probability k/2^n, for some k with 0 ≤ k ≤ 2^n, taking n units of time.
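A minimal sketch (ours) of this last step, simulating an event of dyadic probability k/2^n with n fair coin flips:

```python
import random

def dyadic_event(k, n):
    """Simulate an event of probability k / 2**n with n fair coin flips.

    The n flips form the binary expansion of a uniform integer r in
    {0, ..., 2**n - 1}; the event {r < k} then has probability exactly
    k / 2**n, taking n units of time as in the text.
    """
    r = 0
    for _ in range(n):
        r = (r << 1) | random.randint(0, 1)  # one fair coin flip
    return r < k

# Example: an event of probability 5/8 = 5 / 2**3.
happened = dyadic_event(5, 3)
```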

3.4. The Standard Scatter Machine

The main results of this section are upper and lower bounds on the computational power of the ShSM and the SmSM under the standard communication protocol.

3.4.1. Probabilistic trees

When working with an error-prone protocol, the sharp and the smooth scatter machines obtain approximations of the vertex position using the linear search algorithm (see Ref. [29]). After each run of the search algorithm on the ShSM, the Turing machine is either in state qr or ql, depending on whether the particle was collected by the right box or by the left box, respectively, so that the oracle consultations of a ShSM can be seen as a binary query tree. In the smooth version SmSM, there is a third state qt, denoting that no result arrived within the scheduled time, and the oracle consultations represent a ternary query tree. We are then interested in directed weighted computation trees where the weights represent


the probability of going from one query state to another. We can now relate two machines that differ only on the oracle calls:

Proposition 2 (see Ref. [7]). Let M be an error-prone scatter machine that decides the set A ⊆ Σ* with error probability ε ≤ 1/4 and space bound p. If M′ is a scatter machine that, for an input w with |w| = n, behaves exactly like M except when the oracle is called, and for any call the probabilities of the two machines returning the same answer differ by at most 2^{−p(n)−4}, then M′ decides A with error probability ε ≤ 3/8.

3.4.2. Sparse oracles

An oracle O is said to be sparse if, for each n, there are at most polynomially many (in n) words of size n in O. Given a dyadic number y ∈ ]0, 1[ in the terminating form^e and an increasing total function f : N → Σ*, we construct a sparse set O_y^f. We consider the infinite word y in binary, possibly ending in all zeros, and, after each |f(n)| digits, we append either the symbol "=" or the symbol ">", depending on whether y is equal to its binary expansion truncated at that point or not, respectively. We take O_y^f to be the prefix-closed set of all the prefixes of such a word. This set is clearly sparse and can even be considered as a tally set.

^e For example, 0.01 instead of 0.001111 . . .
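A small sketch (ours) of the marked word underlying O_y^f, under our reading of the construction (markers inserted blockwise, with block lengths |f(1)|, |f(2)|, . . .):

```python
from fractions import Fraction

def marked_word(y, block_lengths):
    """Build the '='/'>'-marked word whose prefixes form O_y^f.

    y is a dyadic rational in ]0,1[ in terminating form; after each
    block of |f(n)| binary digits we append '=' if y equals its
    truncation at that point, and '>' otherwise, as in the text.
    """
    digits, frac = [], y
    while frac > 0:                   # binary digits of y after the point
        frac *= 2
        bit = int(frac >= 1)
        digits.append(str(bit))
        frac -= bit
    word, pos, seen = "", 0, Fraction(0)
    for length in block_lengths:
        for _ in range(length):
            bit = digits[pos] if pos < len(digits) else "0"
            seen += Fraction(int(bit), 2 ** (pos + 1))
            word += bit
            pos += 1
        word += "=" if seen == y else ">"
    return word

# Example: y = 5/8 = 0.101 in binary, block lengths 1, 2, 3:
print(marked_word(Fraction(5, 8), [1, 2, 3]))  # -> '1>01=000='
```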


The following proposition is not stated in common textbooks:

Proposition 3. PSPACE/poly = ∪_{O sparse} PSPACE(O).

Proof. First, let us assume A ∈ PSPACE/poly. Then there exist B ∈ PSPACE and an advice function f ∈ poly such that, for every w ∈ Σ*, w ∈ A if and only if ⟨w, f(|w|)⟩ ∈ B. Let MB be an advice Turing machine which decides B in polynomial space and let O be the set {⟨0^n, x⟩ : n ∈ N and x is a prefix of f(n)}. For any n ∈ N and k ∈ {0, 1, . . . , n − 1}, there is at most one prefix x of f(k) with |⟨0^k, x⟩| = n, so the set O is sparse. Consider the oracle Turing machine M′ which, for an input w, uses the oracle O to obtain ⟨0^{|w|}, f(|w|)⟩ (see Ref. [30]). M′ then simulates MB over ⟨w, f(|w|)⟩ and accepts w if and only if MB accepts ⟨w, f(|w|)⟩. Thus, M′ decides A. To get ⟨0^{|w|}, f(|w|)⟩ and write ⟨w, f(|w|)⟩, polynomial space on |w| is enough, and to simulate MB on ⟨w, f(|w|)⟩, polynomial space on |⟨w, f(|w|)⟩| is enough, which is polynomial on |w|. We conclude that M′ operates in polynomial space, and so A ∈ PSPACE(O). The computation time of M′ on w is the time that MB needs for the input ⟨w, f(|w|)⟩ plus the time needed to get ⟨0^{|w|}, f(|w|)⟩ and write ⟨w, f(|w|)⟩; except for the simulation, this time is polynomial on |w|.

Conversely, let us assume that A ∈ PSPACE(O) for some sparse set O and let M be the machine which decides A in polynomial space p with the oracle O. Let m_n ∈ N be the number of words of length n in the sparse set O. Then there is a polynomial q such that m_n ≤ q(n), for any n ∈ N. Let w^n_1, w^n_2, . . . , w^n_{m_n} be all the words of length n in O and let f be the advice function such that

f(n) = w^1_1 # · · · # w^1_{m_1} # w^2_1 # · · · # w^2_{m_2} # · · · # w^{p(n)}_1 # · · · # w^{p(n)}_{m_{p(n)}}.

We note that f ∈ poly since, for any n ∈ N, we have

|f(n)| ≤ Σ_{i=1}^{p(n)} ((i + 1) q(i)) − 1,

which is a polynomial on n. Consider the advice Turing machine M′ which, for an input ⟨w, f(|w|)⟩, behaves exactly like M for the input w except when the oracle is called: for a query q it uses the advice f(|w|) to give the same answer as the oracle (if q occurs in f(|w|) it answers yes, otherwise it answers no). For an input w, M can only write in the query tape a word q with |q| ≤ p(|w|), so M′ can answer any query arising in a computation of M. We conclude that M′ accepts ⟨w, f(|w|)⟩ if and only if M accepts w. Since M is space bounded by a polynomial, M′ needs polynomial space on |⟨w, f(|w|)⟩| to simulate M on w and, to get the answers from the oracle, M′ needs no further space. Therefore, we conclude that A ∈ PSPACE/poly. □


3.4.3. Coding and decoding the vertex position

In order to code the vertex position, we recall the Cantor set C3, the set of real numbers of the form

x = Σ_{k=1}^{+∞} x_k 2^{−3k}

for x_k ∈ {1, 2, 4}. This corresponds to the set of numbers whose binary expansions are composed of the triples 001, 010 or 100. For a set A ∈ BPP//poly, let f be the advice function that witnesses it. We set the vertex position to y = 0.x(f), where

x(f) = c(f(0)) 001 c(f(1)) 001 c(f(2)) 001 · · ·

and c(w) is obtained by replacing the 0's and the 1's of the word w by 100 and 010, respectively. Note that the length of c(f(m)) is linear in the length of f(m) and that there is no ambiguity in the binary expansion, since no dyadic number can occur (see Ref. [6]). To get the digits of x(f), and consequently of f(0), f(1), . . ., we must consider the three types of precision. When working with the infinite precision protocol, we start by shooting the particle from z = 0.011: if the answer is "left" then y > z and the first triple of y is 100; otherwise y < z and the first triple is 010 or 001. For the next digits we shoot from z = y′011, where y′ are the already known digits of y; if the answer is "right" we shoot again from z = y′0101 to decide whether the next triple is 010 or 001.^f The space needed is O(|f(n)|), which is polynomial, and, in the sharp version, polynomial time is enough, since the oracle calls take constant time. With the fixed precision, we observe that the distance between a number r = 0.r1 r2 . . . rk . . . in the Cantor set and any dyadic rational with denominator 2^k is at least 2^{−k−5} (see Proposition 5), and so, if we take r = 0.x(f), to determine its first k places we need to determine some number r′ such that |r′ − r| ≤ 2^{−k−5}. If we shoot the cannon u times, then by Chebyshev's inequality we conclude that P(|r′ − r| > 2^{−k−5}) ≤ 2^{2k+10}/u.

^f Using this method, the query will coincide with x(f) in all digits except for the last two or three.
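A compact sketch (ours) of the encoding and of the infinite-precision extraction, with the error-free experiment idealized as the predicate "y > z" and with simplified shot positions (the chapter's linear search uses slightly different queries; cf. footnote f):

```python
from fractions import Fraction

def c(word):
    """Cantor C3 encoding of a binary word: 0 -> 100, 1 -> 010."""
    return "".join("100" if b == "0" else "010" for b in word)

def x_of_f(advice_words):
    """Digit string x(f) = c(f(0)) 001 c(f(1)) 001 ... (finite prefix)."""
    return "".join(c(w) + "001" for w in advice_words)

def as_frac(digits):
    """The real number 0.digits, for a finite binary word."""
    return Fraction(int(digits, 2), 2 ** len(digits)) if digits else Fraction(0)

def extract_triples(y_digits, m):
    """Recover the first m triples of y = 0.y_digits with error-free shots.

    The experiment is idealized as the predicate y > z (answer 'left').
    For a genuine (infinite, non-dyadic) Cantor point the comparisons
    below are strict; the finite example ends in 001, so exactness of
    the Fraction arithmetic is harmless here.
    """
    y, known = as_frac(y_digits), ""
    for _ in range(m):
        if y > as_frac(known + "011"):    # left: only triple 100 lies above
            known += "100"
        elif y > as_frac(known + "01"):   # 010 lies above 0.known01, 001 below
            known += "010"
        else:
            known += "001"
    return known

# Example: advice words f(0) = '1', f(1) = '01'; five triples of x(f).
digits = x_of_f(["1", "01"])              # '010001100010001'
assert extract_triples(digits, 5) == digits
```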


If we allow some probability of at most γ_k of making a mistake, we need u_k = 2^{2k+10}/γ_k cannon shots to determine the first k digits of r. For a word of size n we need to keep extracting digits until we see n + 1 "001" markers; however, by making γ_k constant, it is very likely that a mistake is made before we terminate (in fact, we will almost surely make a mistake as n tends to infinity), and so we take γ_k = γ_1/2^{k+1}, so that the probability of making a mistake is now less than γ_1/2. The algorithm is then: set k = 1, fire the cannon u_k times and check whether the first k digits of f(n) have been found; if not, increase k and repeat. The only problem would be the time taken to perform these shots, but it is polynomial in n (see Ref. [6]). We note that, given a prefix advice function f, an error-prone SmSM can obtain m triples of x(f) with an error probability smaller than 2^{−c} by performing no more than 2^{6m+8+c} calls to the oracle (see Refs. [9, 29]). Finally, a machine working with the arbitrary precision protocol obtains the digits of f(n) by simulating an error-free scatter machine (see Proposition 4). With a sharp wedge, each experiment takes constant time; for the smooth version, to obtain m triples of x(f) we need fewer than 2^{3m+3} transitions, and since at the ith call we have |y − z| > 2^{−3i−2}, we conclude that the ith call needs at most 2^{3i+2} units of time to be answered, and therefore we need at most exponential time. We now state a nice result that relates infinite precision with arbitrary precision:

Proposition 4 (see Ref. [9]). Let q be a query of the linear search method of a scatter machine with vertex position y. The error-prone arbitrary precision scatter experiment, with the precision set to at least 2^{−|q|−3}, has the same result as the error-free scatter experiment. In the smooth case of the experiment, the time is at most doubled when going from infinite precision to arbitrary precision.
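A sketch (ours) of the fixed-precision extraction loop just described: the vertex information is estimated from the empirical frequency of "right" answers, with the shot counts u_k = 2^{2k+10}/γ_k from the text.

```python
import math, random

def estimate(shoot_right, k, gamma1):
    """Estimate r = P(right answer) to within 2**(-k-5), whp.

    shoot_right(): one run of the fixed-precision experiment, True iff
    the answer is 'right'.  Round j uses u_j = 2**(2j+10) / gamma_j
    shots with gamma_j = gamma1 / 2**(j+1); by Chebyshev's inequality
    the round-j estimate errs by more than 2**(-j-5) with probability
    at most gamma_j, so the total error probability over all rounds
    stays below gamma1 / 2, as in the text.
    """
    r_est = 0.0
    for j in range(1, k + 1):
        gamma_j = gamma1 / 2 ** (j + 1)
        u = math.ceil(2 ** (2 * j + 10) / gamma_j)
        hits = sum(shoot_right() for _ in range(u))
        r_est = hits / u
        # here one would check whether the first j digits (triples of
        # x(f)) are already determined, and stop early if so
    return r_est

# Toy example (assumed encoding r ~ 0.33; small k keeps the number of
# simulated shots manageable):
r_true = 0.33
r_hat = estimate(lambda: random.random() < r_true, 2, 0.1)
```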


The next proposition allows us to obtain an accurate enough approximation of the vertex position, which then implies that a scatter machine working with the fixed precision protocol can obtain the vertex position with an error as small as we want.

Proposition 5 (see Ref. [6]). Let z ∈ ]0, 1[ be a real number and y = 0.x(f) a number in the Cantor set, for some prefix advice function f. If |y − z| < 2^{−n−5}, then the first n digits of the binary forms of z and y coincide.

The scatter machine will now find a real number close enough to the vertex position so that we get the wanted amount of digits by performing at most 2^{6m+8+c} oracle calls, where c is such that ε ≥ 2^{−c} (see Section 3.4.3). An argument analogous to the previous one shows that a ShSM only needs polynomial time, but a SmSM needs exponential time. We can now conclude that, for the fixed precision protocol, both the sharp and the smooth scatter machines can obtain the same information from the vertex position.

3.4.4. Lower bounds

We start by noticing that, if C is a class of sets and F is a class of advice functions, then C/F* ⊆ C/F, since the set B and the prefix advice function f ∈ F that witness A ∈ C/F* satisfy, in particular, the condition for |w| = n. However, under certain conditions the converse is also true:

Proposition 6. Let C be a class of sets decidable by deterministic Turing machines clocked in polynomial or superpolynomial time. We have that C/poly ⊆ C/poly*.

Proof. If A ∈ C/poly, then there exist a set B ∈ C and an advice function f ∈ poly such that w ∈ A if and only if ⟨w, f(|w|)⟩ ∈ B. Let MB be the Turing machine which decides B in C and let f′ ∈ poly be the prefix advice function such that f′(n) = f(0)001f(1)001 · · · 001f(n)001. Consider the advice Turing machine M′ such that, on input ⟨w, f′(n)⟩, for some n ≥ |w| = r, it reads f′(n) in triples of digits until the rth occurrence of 001, to find the decoding y of z, where z ∈ Σ* is the word lying between the rth and the (r + 1)th occurrences of 001 in f′(n). M′ will then simulate MB on ⟨w, y⟩, accepting if and only if ⟨w, y⟩ ∈ B. Thus, M′ accepts ⟨w, f′(n)⟩ if and only if w ∈ A. Since MB decides a set in C and


retrieving y from f′(n) requires a polynomial amount of time, we conclude that M′ decides a set that is also in C. □

The scatter machine bounded in polynomial space can decide all the sets that the scatter machine clocked in polynomial time can decide, since the scatter machine bounded in polynomial space p can simulate the scatter machine clocked in polynomial time p. We conclude that the upper bounds for scatter machines clocked in polynomial time are included in the corresponding lower bounds for scatter machines bounded in polynomial space (see Refs. [6, 7, 9]):

         Infinite       Arbitrary      Fixed
ShSM     P/poly         P/poly         BPP//log*
SmSM     P/log*         BPP//log*      BPP//log*

However, it is possible to get better lower bounds, as one can see from the following three propositions:

Proposition 7 (Standard error-free ShSM). If A ∈ PSPACE/poly, then A is decidable by an error-free ShSM in polynomial space.

Proof. Since, by Proposition 6, PSPACE/poly = PSPACE/poly*, we conclude that A ∈ PSPACE/poly*, and so there exist B ∈ PSPACE and a prefix advice function f ∈ poly such that, for every n ∈ N and every w ∈ Σ* with |w| ≤ n, w ∈ A if and only if ⟨w, f(n)⟩ ∈ B. Let MB be an advice Turing machine which decides B in polynomial space and consider the error-free ShSM M with vertex position y = 0.x(f) (see Section 3.4.3) which, for an input w with |w| = n, consults the oracle to obtain f(n), that is, the first |f(n)| triples of x(f). The positioning of the vertex and the process of extracting it are explained in Section 3.4.3 (see also Refs. [6, 27]). M simulates MB over ⟨w, f(n)⟩ and, if MB accepts ⟨w, f(n)⟩, then M accepts w; otherwise M rejects w. Since M accepts w if and only if MB accepts ⟨w, f(n)⟩, we conclude that the scatter machine M decides A. Now, f ∈ poly, n = |w| and |⟨w, f(n)⟩| ∈ O(|w| + |f(n)|), so |x(f(n))| and |⟨w, f(n)⟩| are polynomial on n. Thus, M needs polynomial space


on n to obtain f(n) by means of the experiment, or to write x(f(n)) or ⟨w, f(n)⟩ on a tape. To simulate MB over ⟨w, f(n)⟩, M needs polynomial space on |⟨w, f(n)⟩|, since MB runs in polynomial space. We conclude that M operates in polynomial space. □

Proposition 8 (Standard arbitrary precision ShSM). If A ∈ BPPSPACE//poly, then A is decidable by an error-prone arbitrary precision ShSM in polynomial space.

Proof. Since, by Proposition 6, BPPSPACE//poly = BPPSPACE//poly*, we conclude that A ∈ BPPSPACE//poly*. Let M be the probabilistic advice Turing machine bounded in polynomial space, f ∈ poly the prefix advice function, and ε the constant with 0 < ε < 1/2 that witness A ∈ BPPSPACE//poly*. Consider the probabilistic error-prone arbitrary precision ShSM M′ with vertex position y = 0.x(f) which, for an input w with |w| = n, runs the scatter experiment to obtain f(n), as explained in Section 3.4.3 (see also Ref. [6]). M′ simulates M over ⟨w, f(n)⟩ and accepts w if and only if M accepts ⟨w, f(n)⟩ (see Proposition 4). We note that the time needed to perform the experiments and write x(f(n)) and ⟨w, f(n)⟩ is polynomial on n, and the time needed to simulate the probabilistic component with the oracle and to simulate M over ⟨w, f(n)⟩ can be at most exponential on n. As M′ can simulate M using its probabilistic oracle, it does not need a probabilistic digital component, since this component would be the digital component of M. We conclude that M′ errs in deciding A with probability at most ε, so it decides A. Since f ∈ poly, n = |w| and |⟨w, f(n)⟩| ∈ O(|w| + |f(n)|), we have that |x(f(n))| and |⟨w, f(n)⟩| are polynomial on n. Thus, to obtain f(n) with the experiment, or to write x(f) or ⟨w, f(n)⟩ on a tape, M′ needs polynomial space on n. To simulate M over ⟨w, f(n)⟩, M′ needs polynomial space on |⟨w, f(n)⟩|, since M is bounded in polynomial space and the simulation of the probabilistic component can be done in polynomial space. We can conclude that M′ operates in polynomial space. □


Proposition 9 (Standard fixed precision ShSM). If A ∈ BPPSPACE//poly, then A is decidable by an error-prone fixed precision ShSM in polynomial space.

Proof. Since, by Proposition 6, BPPSPACE//poly = BPPSPACE//poly*, we conclude that A ∈ BPPSPACE//poly*. Let M be the probabilistic advice Turing machine bounded in polynomial space, f ∈ poly the prefix advice function, and ε the constant with 0 < ε < 1/2 that witness A ∈ BPPSPACE//poly*. Consider the probabilistic error-prone fixed precision ShSM M′ with fixed error ξ and vertex position y′ = 1/2 + ξ − 2ξy, where y = 0.x(f), as described in Section 3.4.3 (see also Ref. [6]), and a constant δ with ε + δ < 1/2. Recalling Section 3.4.3, on an input w with |w| = n, M′ can obtain f(n) with error probability smaller than δ. We define M′ so that, after obtaining f(n), it simulates M over ⟨w, f(n)⟩ and accepts w if and only if M accepts ⟨w, f(n)⟩. As M′ can simulate M using its probabilistic oracle, it does not need a probabilistic digital component. We conclude that M′ fails to obtain f(n) or simulates a wrong computation of M with probability at most δ + ε < 1/2. Therefore, M′ decides A. Since f ∈ poly, n = |w| and |⟨w, f(n)⟩| ∈ O(|w| + |f(n)|), we have that |x(f(n))| and |⟨w, f(n)⟩| are polynomial on n. Thus, to obtain f(n) with the oracle, or to write x(f(n)) or ⟨w, f(n)⟩ on a tape, M′ needs polynomial space on n. To simulate M over ⟨w, f(n)⟩, M′ needs polynomial space on |⟨w, f(n)⟩|, since M is bounded in polynomial space and the simulation of the probabilistic component can be done in polynomial space. So, we can conclude that M′ operates in polynomial space. □

To compute lower bounds for the smooth scatter machine, we have to guarantee that the experiment does not run indefinitely. According to the linear search algorithm described in Section 3.4.1 (see also Refs. [9, 24]), a SmSM does not shoot at the vertex position and, although it converges to that position, it shoots at a far enough distance so that the physical time of each shot is at most exponential on the query size; hence, the space used by the clock is at most


polynomial. Therefore, there is no need to use a time schedule in the next two propositions.

Proposition 10 (Standard error-free SmSM). If A ∈ PSPACE/poly, then A is decidable by an error-free SmSM in polynomial space.

Proof. Since, by Proposition 6, PSPACE/poly = PSPACE/poly*, we conclude that A ∈ PSPACE/poly*. Let B ∈ PSPACE, let MB be the advice Turing machine bounded in polynomial space that decides B, and let f ∈ poly be the prefix advice function that witnesses A ∈ PSPACE/poly*. Consider the error-free SmSM M with vertex position y = 0.x(f) (see Section 3.4.3) which, for an input w with size |w| = n, consults the oracle to obtain f(n). With the search method described in Refs. [9, 24], any call to the oracle has a physical time at most exponential on the query size. M then simulates MB over ⟨w, f(n)⟩ and accepts w if and only if MB accepts ⟨w, f(n)⟩. By construction, we conclude that the scatter machine M decides A. Since f ∈ poly, n = |w| and |⟨w, f(n)⟩| ∈ O(|w| + |f(n)|), we have that |x(f(n))| and |⟨w, f(n)⟩| are polynomial on n. Thus, M needs polynomial space on n to obtain f(n) by means of the experiment, or to write x(f(n)) or ⟨w, f(n)⟩ on a tape. To simulate MB over ⟨w, f(n)⟩, M needs polynomial space on |⟨w, f(n)⟩|, since MB runs in polynomial space. We conclude that M operates in polynomial space. □

Proposition 11 (Standard arbitrary precision SmSM). If A ∈ BPPSPACE//poly, then A is decidable by an error-prone arbitrary precision SmSM in polynomial space.

Proof. Since, by Proposition 6, BPPSPACE//poly = BPPSPACE//poly*, we conclude that A ∈ BPPSPACE//poly*. Let M be the probabilistic advice Turing machine bounded in polynomial space, f ∈ poly the prefix advice function, and ε the constant with 0 < ε < 1/2 that witnesses A ∈ BPPSPACE//poly*. Consider the probabilistic error-prone arbitrary precision SmSM M′ with vertex position y = 0.x(f) (see Section 3.4.3) and error ξ = 2^{−|q|−3}, where q is the word in the query tape, which, for an input w with size |w| = n, consults the oracle to obtain f(n). With the linear search method


(see Refs. [9, 24]), any call to the oracle has a physical time at most exponential on the query size. M′ then simulates M over ⟨w, f(n)⟩ and accepts w if and only if M accepts ⟨w, f(n)⟩ (see Proposition 4). As M′ can simulate M using its probabilistic oracle, it does not need a probabilistic digital component. We conclude that M′ errs in deciding A with probability at most ε, so it decides A. Since f ∈ poly, n = |w| and |⟨w, f(n)⟩| ∈ O(|w| + |f(n)|), we have |x(f(n))| and |⟨w, f(n)⟩| polynomial on n. Thus, M′ needs polynomial space on n to obtain f(n) with the experiment, or to write x(f(n)) or ⟨w, f(n)⟩ on a tape. To simulate M over ⟨w, f(n)⟩, M′ needs polynomial space on |⟨w, f(n)⟩|, since M is bounded in polynomial space and the simulation of the probabilistic component can be done in polynomial space. So, we conclude that M′ operates in polynomial space. □

Proposition 12 (Standard fixed precision SmSM). If A ∈ BPPSPACE//poly, then A is decidable by an error-prone fixed precision SmSM in polynomial space, which makes use of the time schedule.

Proof. Just as in the sharp version case, this proof is very similar to the proof for arbitrary precision. Once again we use y′ = 1/2 + ξ − 2ξy as the vertex position and consider a constant δ with 0 < δ < 1/2 − ε, so that M′ fails to obtain f(n) or simulates a wrong computation of M with probability at most δ + ε < 1/2, therefore deciding A. The computation time of M′ is the time that M needs for the input ⟨w, f(n)⟩, with the oracle simulated, which is exponential, plus the time needed to perform the experiments and write x(f(n)) and ⟨w, f(n)⟩, which is also exponential; therefore, we can still use Chebyshev's inequality to get the number of calls to the oracle necessary to reach the desired approximation of the vertex position. □

We summarize the lower bounds obtained above in the following table:

         Infinite        Arbitrary          Fixed
ShSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly
SmSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly

3.4.5. Upper bounds

Upper bounds will be established by direct simulation of the scatter machine on the oracle Turing machine. For that purpose, we need a method to compare a dyadic rational q ∈ ]0, 1[ of size n with a real number b ∈ ]0, 1[ (both in binary). A Turing machine can decide whether q ≤ b or q ≥ b by comparing them digit by digit,^g but it may not be able to decide between strict and non-strict inequalities. However, if we know whether b = b↾n, b = b↾n + 2^{−n}, or neither of them,^h we can decide between q < b, q > b and q = b.

Proposition 13 (Standard error-free ShSM). If a set A ⊆ Σ* is decidable by an error-free ShSM in polynomial space, then A ∈ PSPACE/poly.

Proof. Let M be an error-free ShSM bounded in polynomial space p which decides A with vertex position y, and let O_y^{p′} be the sparse set described in Section 3.4.2, with a polynomial p′ given by p′(0) = p(0) and p′(n) ≥ max{p(n), p′(n − 1)} for any n ∈ N with n ≥ 1. Consider the oracle Turing machine M′ that, for an input w with |w| = n, consults the oracle O_y^{p′} to obtain its word o of length p′(n) + n + 1. M′ then simulates M on w, using o to answer the queries. For any input w, M can only write in the query tape a query q such that |q| ≤ p(|w|), so any possible query of M can be answered by comparing it to o, since o ends with either an '=' or a '>' sign and no query q has more numeric digits than o; we thus conclude that M′ decides A. To simulate M over w, answering the oracle calls with the word o, M′ needs polynomial space on |w|, since M is bounded in polynomial space and no space is needed for the comparison of a

^g This comparison can be done in finite time since q has a finite binary representation.
^h We denote by x↾n, for x ∈ ]0, 1[, the first n digits of the binary form of x, that is, the first n digits after the dot.


query q with o. To get o out of O_y^{p′}, M′ needs polynomial space on |w|, since |o| = p′(|w|) + |w| + 1. We can conclude that M′ operates in polynomial space and, by means of Proposition 3, we get the desired result A ∈ PSPACE/poly. □

Proposition 14 (Standard arbitrary precision ShSM). If a set A ⊆ Σ* is decidable by an error-prone arbitrary precision ShSM in polynomial space, then A ∈ BPPSPACE//poly.

Proof. Let M be an error-prone arbitrary precision ShSM bounded in polynomial space p which decides A with vertex position y. We assume, without loss of generality, that M has error probability ε ≤ 1/4. Let f be the advice function such that f(n) = y↾(2p(n)+3), which is clearly in poly. We have that |y − 0.y↾(2p(n)+3)| ≤ 2^{−2p(n)−3} and that 0.y↾(2p(n)+3) is a dyadic rational. Consider the probabilistic advice Turing machine M′ which, for an input ⟨w, f(|w|)⟩ with |w| = n, behaves exactly like M for the input w (see Proposition 4) except when the oracle is called, where, for a query q, it uses the advice f(n) to digitally simulate the probabilistic answer of the oracle; thus, M′ accepts ⟨w, f(|w|)⟩ if its simulation of M accepts w. As M is bounded in polynomial space p, for an input w it can only write in the query tape a word q with |q| ≤ p(|w|) and, according to the protocol, the uncertainty of the cannon position is greater than or equal to 2 · 2^{−p(|w|)}. Since also |y − 0.y↾(2p(n)+3)| ≤ 2^{−2p(n)−3}, we conclude that, for any call to the oracle, the difference between the probabilities of the answers of M and M′ is at most 2^{−p(n)−4}. Thus, M′ is simulating a ShSM like M, but with a mandatory dyadic vertex position and close enough probability values. Therefore, calling that ShSM M″, by Proposition 2 we conclude that M″ decides A with error probability ε ≤ 3/8, and so M′ accepts ⟨w, f(|w|)⟩ with error probability ε ≤ 3/8. We can conclude that M′ decides A. To simulate M over w with the oracle calls simulated digitally, M′ needs polynomial space on |⟨w, f(|w|)⟩|, since M is bounded in polynomial space and the oracle is simulated in polynomial space on |w|. We then conclude that M′ operates in polynomial space, and so A ∈ BPPSPACE//poly. □
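A sketch (ours) of the digit-by-digit comparison underlying these simulations: a dyadic query q is compared with the vertex y, given y's first digits and the '='/'>' marker recorded in the oracle word of Section 3.4.2.

```python
def compare(q, y_prefix, marker):
    """Relation of the dyadic 0.q to the real y ('<', '=' or '>').

    y_prefix holds at least as many digits of y as q has, and marker is
    '=' if y equals 0.y_prefix, '>' if y is strictly larger (the two
    cases recorded after each block of the oracle word o).
    """
    qq = q + "0" * (len(y_prefix) - len(q))   # pad q to equal length
    if qq < y_prefix:                         # same-length strings:
        return "<"                            # lexicographic = numeric
    if qq > y_prefix:
        return ">"
    # equal digits: the marker decides between q = y and q < y
    return "=" if marker == "=" else "<"

# Example: y = 0.1011... with marker '>' (y exceeds its truncation):
assert compare("10", "1011", ">") == "<"
assert compare("11", "1011", ">") == ">"
```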


Proposition 15 (Standard fixed precision ShSM). If a set A ⊆ Σ* is decidable by an error-prone fixed precision ShSM in polynomial space, then A ∈ BPPSPACE//poly.

Proof. Let M be an error-prone fixed precision ShSM bounded in polynomial space p which decides A with vertex position y. We assume, without loss of generality, that M has error probability ε ≤ 1/4 and fixed error ξ = 2^{−N} for some N ∈ N, giving 2 · 2^{−N} as the uncertainty of the cannon position. An argument analogous to the previous proof works, since we can use f(n) = y↾(p(n)+3+N) as the advice function: 0.y↾(p(n)+3+N) is a dyadic rational and |y − 0.y↾(p(n)+3+N)| ≤ 2^{−p(n)−3−N}, which means that the difference between the probabilities of the answers of M and M′ is at most 2^{−p(n)−4}, since N ≥ 1, and so we can use Proposition 2. □

Proposition 16 (Standard error-free SmSM). If a set A ⊆ Σ* is decidable by an error-free SmSM in polynomial space, using the time schedule, then A ∈ PSPACE/poly.

Proof. Let M be an error-free SmSM which decides A with vertex position y and time schedule T, bounded in polynomial space p. Consider, for any (input size) n ∈ N, the boundary numbers l_n and r_n of M, and set f_n^l = 0 if l_n = l_n↾n, f_n^l = 1 if l_n = l_n↾n + 2^{−n}, and f_n^l = ε otherwise. Analogously, set f_n^r = 0 if r_n = r_n↾n, f_n^r = 1 if r_n = r_n↾n + 2^{−n}, and f_n^r = ε otherwise. Let f be the prefix advice function such that

f(n) = l_1↾1 # f_1^l # r_1↾1 # f_1^r # · · · # l_{p(n)}↾p(n) # f_{p(n)}^l # r_{p(n)}↾p(n) # f_{p(n)}^r,

which is clearly in poly. Consider the advice Turing machine M′ which, for an input ⟨w, f(|w|)⟩, behaves exactly like M for the input w except when the oracle is called, where, for a query q, it uses the approximations l_{|q|}↾|q| and r_{|q|}↾|q| of the boundary numbers and the digits f_{|q|}^l and f_{|q|}^r in the advice f(|w|) to compare them with q and give the same answer as the oracle. For an input w, M can only write in the query tape a word q with |q| ≤ p(|w|), and so M′ can compare any query needed in a computation of M. We conclude


then that M′ accepts ⟨w, f(|w|)⟩ if and only if M accepts w; thus, M′ decides A. To simulate M over w with the oracle calls answered digitally, M′ needs polynomial space on |⟨w, f(|w|)⟩|, since M is bounded in polynomial space and the oracle simulation needs no further space. We then conclude that M′ operates in polynomial space, and thus A ∈ PSPACE/poly. □

Proposition 17 (Standard arbitrary precision SmSM). If a set A ⊆ Σ* is decidable by an error-prone arbitrary precision SmSM in polynomial space, using the time schedule, then A ∈ BPPSPACE//poly.

Proof. Let M be an error-prone arbitrary precision SmSM which decides A with vertex position y and time schedule T, bounded in polynomial space p. We assume, without loss of generality, that M has error probability ε ≤ 1/4. The proof is similar to the proof of Proposition 16, but we take

f(n) = l_1↾(p(n)+3+1) # r_1↾(p(n)+3+1) # · · · # l_{p(n)}↾(p(n)+3+p(n)) # r_{p(n)}↾(p(n)+3+p(n)),

which is also in poly, and, for a query q, M′ uses the approximations l_{|q|}↾(p(|w|)+3+|q|) and r_{|q|}↾(p(|w|)+3+|q|) of the boundary numbers. According to the protocol, for a query q, the uncertainty of the cannon position is 2 · 2^{−|q|}. As |l_{|q|} − 0.l_{|q|}↾(p(n)+3+|q|)| ≤ 2^{−p(n)−3−|q|} and |r_{|q|} − 0.r_{|q|}↾(p(n)+3+|q|)| ≤ 2^{−p(n)−3−|q|}, and also 0.r_{|q|}↾(p(n)+3+|q|) ≤ r_{|q|}, we conclude that, for any call to the oracle, the difference between the probabilities of the answers of M and M′ is at most 2^{−p(n)−4}, since |q| ≥ 1. Thus, M′ is simulating a SmSM like M, but with mandatory dyadic boundary numbers and close enough probability values. Therefore, calling that SmSM M″, by Proposition 2 we conclude that M″ decides A with error probability ε ≤ 3/8, and so M′ accepts ⟨w, f(|w|)⟩ with error probability ε ≤ 3/8. We conclude then that M′ decides A. To simulate M over w with the oracle calls simulated digitally, M′ needs polynomial space on |⟨w, f(|w|)⟩|, since M is bounded in polynomial space and the oracle is simulated in polynomial space on |w|. We conclude then that M′ operates in polynomial space, and thus A ∈ BPPSPACE//poly. □


Proposition 18 (Standard fixed precision SmSM). If a set A ⊆ Σ* is decidable by an error-prone fixed precision SmSM in polynomial space, using the time schedule, then A ∈ BPPSPACE//poly.

Proof. The proof is again similar to proofs already presented, in particular the proof of Proposition 17. The only details that differ are the fixed error ξ = 2^{−N} for some N ∈ N, the prefix advice function

f(n) = l_1↾(p(n)+3+N) # r_1↾(p(n)+3+N) # · · · # l_{p(n)}↾(p(n)+3+N) # r_{p(n)}↾(p(n)+3+N),

and the machine using the approximations l_{|q|}↾(p(|w|)+3+N) and r_{|q|}↾(p(|w|)+3+N) of the boundary numbers. □

The results for the upper bounds can be seen in the following table:

         Infinite        Arbitrary          Fixed
ShSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly
SmSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly

Comparing these results with the ones obtained for the lower bounds, we conclude that the upper bounds coincide with the lower bounds, indicating that the sets decidable by a scatter machine with infinite precision are exactly the sets in PSPACE/poly, and the sets decidable by an error-prone scatter machine are exactly the sets in BPPSPACE//poly.

Theorem 4.1. A set A ⊆ Σ* is decidable by an error-free ShSM (or SmSM) using the standard protocol and bounded in polynomial space if and only if A ∈ PSPACE/poly.

Theorem 4.2. A set A ⊆ Σ* is decidable by an error-prone ShSM (or SmSM) using the standard protocol and bounded in polynomial space if and only if A ∈ BPPSPACE//poly.


3.5. The Generalized Scatter Machine

As in the standard protocol case, we start by importing some lower bounds for the generalized scatter machine bounded in polynomial space, since the generalized scatter machine can decide all the sets a standard scatter machine decides. We can then state the following lower bounds for the analogue-digital sharp scatter machine and the analogue-digital smooth scatter machine bounded in polynomial space:

         Infinite        Arbitrary          Fixed
ShSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly
SmSM     PSPACE/poly     BPPSPACE//poly     BPPSPACE//poly

3.5.1. Lower bounds

In order to compute lower bounds for the computational power of the ShSM when using the generalized protocol, we need an advice function f : N → Σ* constructed over a set A ⊆ Σ* in such a way that f(n) = f_1^n f_2^n · · · f_{2^n}^n, where f_i^n is 1 or 0 depending on whether the ith word of size n of Σ*, ordered lexicographically, belongs to the set A or not, respectively. For any n ∈ N we have |f(n)| = 2^n.

Proposition 19 (Generalized error-free ShSM). Any set A ⊆ Σ* is decidable by an error-free ShSM in polynomial space.

Proof. Consider a set A ⊆ Σ*. Let f be the advice function for the set A, as above, and let g be the prefix advice function such that g(n) = f(0)#f(1)# · · · #f(n)#. The advice word g(n) is given with two symbols from Σ for each of the symbols 0, 1 or #, so that we have advice words over Σ and are able to apply the encoding x to g. Since |f(n)| = 2^n, we have |g(n)| = 2 · (2^{n+1} − 1) + 2n for any n ∈ N. Consider the error-free ShSM M with vertex position y = 0.x(g) which, for an input w with |w| = n, computes the position of w within all the words of size n in Σ*, ordered lexicographically. Assume that w is the ith such word. M then performs 2^{n+1} − 1 + 2n + 2i oracle calls with the linear search method


(which makes the query word into q = x(f(0)#f(1)# · · · #f(n − 1)#f_1^n f_2^n · · · f_i^n)) and accepts w if and only if the answer to the last call is "right", meaning that f_i^n = 1 and so w ∈ A; we conclude that M decides A. To compute the position of w within the words of size n in Σ* ordered lexicographically, to count the 2^{n+1} − 1 + 2n + 2i oracle calls, and to perform them, M needs polynomial space on n, since the oracle takes a constant amount of time for each call. So, we can conclude that M operates in polynomial space. □
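A toy sketch (ours) of the exponential advice of this section: f(n) is the characteristic string of A on the words of length n, and deciding w amounts to reading the bit at w's lexicographic position (the scatter machine never stores f(|w|); it only counts oracle calls up to that position).

```python
from itertools import product

def advice(A, n):
    """f(n): characteristic string of A on the 2**n words of length n.

    The ith bit (lexicographic order) is 1 iff that word is in A;
    here A is a finite set of binary strings, for illustration only.
    """
    return "".join("1" if "".join(w) in A else "0"
                   for w in product("01", repeat=n))

def decide(A, w):
    """Decide w from the advice, as in the proof of Proposition 19."""
    i = int(w, 2)          # position of w among the words of length |w|
    return advice(A, len(w))[i] == "1"

# Example with the toy language A = {'01', '11'}:
assert decide({"01", "11"}, "11") and not decide({"01", "11"}, "00")
```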


Proposition 20 (Generalized arbitrary precision ShSM). Any set A ⊆ Σ* is decidable by an error-prone arbitrary precision ShSM in polynomial space.

Proof. The proof is, in almost every detail, similar to the proof of Proposition 19. The remaining difference is that, when using the oracle with the linear search method and the error-prone arbitrary precision protocol, we need to pad the query word with three 0's in order to get the desired error (accuracy interval) and obtain the same results as with the error-free scatter machine (see Proposition 19). □

Proposition 21 (Generalized fixed precision ShSM). If a set A ∈ BPPSPACE//poly, then A is decidable by an error-prone fixed precision ShSM in polynomial space.

Proof. Using the generalized protocol under the fixed precision assumption, we are not able to access an exponentially long advice in polynomial space. Due to the necessity of counting the oracle calls used in Chebyshev's inequality, we need exponential space to count the calls required to get an exponential amount of triples from the binary form of the vertex position. For an input of size n, we need exponential space on n (namely 6 · 2^n + 8 + c + 1 cells of a tape) to count the 2^{6·2^n+8+c} oracle calls needed to get 2^n triples of the binary expansion of the vertex position with error probability smaller than 2^{−c}. Note that we should be able to get 2^{n+1} − 1 + 2n + 2i triples, where 1 ≤ i ≤ 2^n, to reproduce the previous two proofs. Therefore, an error-prone fixed precision ShSM in polynomial space decides the same sets with the generalized protocol as it decides with the standard one (and the same happens with the smooth scatter experiment). □

In the smooth version case we have two options: we can either use a time schedule to end the scatter experiment or not. If we use it, then the scatter machine has the same computational power that it has when using the standard protocol, and so we can decide PSPACE/poly or BPPSPACE//poly with a SmSM under the error-free protocol or the arbitrary precision protocol, respectively (see Theorems 4.1 and 4.2). Without the time schedule, we have the following lower bounds:

Proposition 22 (Generalized error-free SmSM). Any set A ⊆ Σ* is decidable by an error-free SmSM in polynomial space.

Proof. The proof is, in almost every detail, similar to the proof of Proposition 19. The remaining difference is that, when consulting the physical experiment with the linear search method of Section 3.4.3 and the smooth scatter experiment, the scatter machine takes more than a constant physical time per call, but it gets the same answers from the experiment and the same computational results. □

Proposition 23 (Generalized arbitrary precision SmSM). Any set A ⊆ Σ* is decidable by an error-prone arbitrary precision SmSM in polynomial space.

Proof. The proof is similar to the proof of Proposition 22, the difference being that, when using the oracle with the linear search method and the error-prone arbitrary precision protocol, we need to pad the query word with three 0's in order to get the desired error (accuracy interval) and obtain the same results as with the error-free scatter machine. □

Proposition 24 (Generalized fixed precision SmSM). If a set A ∈ BPPSPACE//poly, then A is decidable by an error-prone fixed precision SmSM in polynomial space, which makes use of the time schedule.


Proof. The proof is exactly the same as the proof of Proposition 12. The relevant difference between the generalized and the standard protocols is that we can use an unbounded number of cells of the query tape with the generalized protocol, even with a space bounded scatter machine. For the lower bound with the error-prone fixed precision protocol we do not use this difference, since all the oracle calls made to obtain the vertex position are performed with the same query. □

The results regarding lower bounds with the generalized protocol, when not using a time schedule, are summarized in the following table:

         Infinite    Arbitrary    Fixed
ShSM     2^{Σ*}      2^{Σ*}       BPPSPACE//poly
SmSM     2^{Σ*}      2^{Σ*}       BPPSPACE//poly

These are very strong results, namely for the error-free and the arbitrary precision protocols, since the lower bound obtained with this new generalized protocol contains all sets, and therefore it must coincide with the upper bound. It also represents a great improvement with respect to the much studied standard protocol. When it comes to the fixed precision protocol, the lower bound coincides with the lower bound (and upper bound) obtained with the standard protocol.

3.5.2. Upper bounds

Proposition 25 (Generalized error-free ShSM). All sets A ⊆ Σ* are decidable by an error-free ShSM bounded in polynomial space.

Proof. It follows directly from Proposition 19. □

Proposition 26 (Generalized arbitrary precision ShSM). All sets A ⊆ Σ* are decidable by an error-prone arbitrary precision ShSM bounded in polynomial space.

Proof. It follows directly from Proposition 20. □


Proposition 27 (Generalized fixed precision ShSM). If a set A ⊆ Σ* is decidable by an error-prone fixed precision ShSM in polynomial space, then A ∈ BPPSPACE//poly.

Proof. The proof is exactly the same as the proof of Proposition 15, except that now we can use an unbounded number of cells of the query tape with the generalized protocol, even with a space bounded scatter machine. We aim to apply Proposition 2, so we need to simulate the scatter machine with a deviation from the real probability values for the experiment answers of at most 2^{−p(n)−4}. Thus, we only need a polynomial amount of digits from the query in order to probabilistically choose a shot position to compare with the vertex position, keeping the approximate probabilities close enough to the real ones. □

Proposition 28 (Generalized error-free SmSM (without time schedule)). All sets A ⊆ Σ* are decidable by an error-free SmSM bounded in polynomial space, without using the time schedule.

Proof. It follows directly from Proposition 22. □

Proposition 29 (Generalized arbitrary precision SmSM (without time schedule)). All sets A ⊆ Σ* are decidable by an error-prone arbitrary precision SmSM bounded in polynomial space, without using the time schedule.

Proof. It follows directly from Proposition 23. □

In order to compute upper bounds when using the time schedule, we first need a new proposition.

Proposition 30. A set A ⊆ Σ* is decidable by a generalized smooth scatter machine bounded in polynomial space, using the time schedule, if and only if it is decidable by a standard smooth scatter machine bounded in polynomial space, using the time schedule.

Proof. A set decidable by a standard scatter machine is also decidable by a generalized scatter machine. Conversely, let M be a generalized SmSM bounded in polynomial space, using a time schedule, which


may use arbitrarily long queries; then it can use at most an exponential time schedule, since the space is polynomial. We can thus consider a clock that uses polynomial space on the size of its input and use it as a time schedule that uses polynomial space on the size of the queries. We conclude that, in order to have M running in polynomial space, we need to consider queries of at most polynomial size. If M only uses polynomially sized queries, then a standard SmSM can use exactly the same queries that M uses in any of its computations. Therefore, a standard SmSM can decide all the sets that M decides, and the desired equivalence follows. □

We now have the necessary tools to prove the upper bounds for the SmSM when using a time schedule.

Proposition 31 (Generalized error-free SmSM (with time schedule)). A set A ⊆ Σ* is decidable by an error-free SmSM in polynomial space, using an exponential time schedule, if and only if A ∈ PSPACE/poly.

Proof. It follows from Propositions 10, 16 and 30. □

Proposition 32 (Generalized arbitrary precision SmSM (with time schedule)). A set A ⊆ Σ* is decidable by an error-prone arbitrary precision SmSM in polynomial space, using an exponential time schedule, if and only if A ∈ BPPSPACE//poly.

Proof. It follows from Propositions 11, 17 and 30. □

Proposition 33 (Generalized fixed precision SmSM (with time schedule)). If a set A ⊆ Σ* is decidable by an error-prone fixed precision SmSM in polynomial space, using the time schedule, then A ∈ BPPSPACE//poly.

Proof. The proof is exactly the same as the proof of Proposition 18. Again, the difference is that now we can use an unbounded number of cells of the query tape with the generalized protocol, even with a space bounded scatter machine. For the upper bound with the error-prone fixed precision protocol we aim to apply Proposition 2, so we need to simulate the scatter


machine with a deviation from the real probability values for the oracle answers of at most 2^{−p(n)−4}. Thus, we only need a polynomial amount of digits from the query in order to probabilistically choose a shot position to compare with the vertex position, keeping the approximate probabilities close enough to the real ones. □

The following table shows the obtained results for the upper bounds on the computational power of scatter machines when using the generalized protocol.

                               Infinite       Arbitrary          Fixed
ShSM                           2^{Σ*}         2^{Σ*}             BPPSPACE//poly
SmSM (with time schedule)      PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
SmSM (without time schedule)   2^{Σ*}         2^{Σ*}             ———

Regarding the upper bounds for the ShSM and for the SmSM without time schedule, the obtained results are what we expected, since the lower bounds are already very constraining (see Section 3.5.1). When using the time schedule, we get the same results as for the standard communication protocol, which can be explained by the fact that, even though the machine uses no space when performing the physical experiment, the clock still uses space when ticking.

3.6. Conclusion

The purpose of this chapter was to study the computational power of the scatter machine bounded in polynomial space. We concluded that a scatter experiment equipped with a deterministic Turing machine bounded in polynomial space is able to simulate a probabilistic Turing machine bounded in exponential time. The simulations are performed with the same tools as those introduced in Refs. [9, 26]. In order to be able to get lower and upper bounds for each type of protocol and chosen precision, we introduced the class BPPSPACE, used as support for non-uniform classes. The results are summarized in Table 3.2.

Table 3.2. Lower and upper bounds of polynomial space bounded scatter machines.

                                   Infinite       Arbitrary          Fixed
ShSM (Standard protocol)
    Lower Bound                    PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
    Upper Bound                    PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
ShSM (Generalized protocol)
    Lower Bound                    2^{Σ*}         2^{Σ*}             BPPSPACE//poly
    Upper Bound                    2^{Σ*}         2^{Σ*}             BPPSPACE//poly
SmSM (Standard protocol)
    Lower Bound                    PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
    Upper Bound                    PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
SmSM (Generalized protocol)
    Lower Bound                    2^{Σ*}         2^{Σ*}             BPPSPACE//poly
    Upper Bound (with time
      schedule)                    PSPACE/poly    BPPSPACE//poly     BPPSPACE//poly
    Upper Bound (without time
      schedule)                    2^{Σ*}         2^{Σ*}             ———

As we can observe, with respect to the fixed precision, our new class BPPSPACE//poly represents both the lower and the upper bound on the computational power of the scatter machine, for both types of protocols and both wedge versions. This is explained by the fact that we resort to a fixed cannon position to obtain the vertex position in both protocols. Concerning the different versions of the experiment, we have the same lower and upper bounds as in the time-restricted cases, where we find that the computational power is the same for both the sharp and the smooth wedge.

Having finished this exhaustive study of Turing machines with access to stochastic oracles, either bounded in time or in space, we wonder whether there are open problems left for the time bounded machines that deserve investigation. We list a few technical unanswered questions that can be of interest:

(1) Although in Ref. [29] we generalized the error-prone distributions to a more general setting, we still do not have full knowledge of how processing errors in measurements might boost computation.

(2) In the case of two-sided and threshold (one-sided) oracles, we would like to prove that Table 3.1 stands even if we remove


the condition "w/exponential T" as explained in the caption. That is, we want to prove that either the upper bound falls to P/log* (BPP//log* in non-infinite precision) or that there is a set decided in polynomial time by one such machine that is not in BPP//log*. We believe that this can be done for the fixed precision.

(3) With respect to polynomial time bounded scatter machines, we have proven that upper bounds = lower bounds only when the consultation time of the oracle is a known expression. Could we prove the same upper bounds otherwise?

(4) A descending chain of non-uniform classes can be constructed, considering BPP//log^{(ω)}*, where log^{(ω)} = ∪_{k∈N} log^{(k)} and log^{(k)} is the iterated log class of functions,^i and it can be continued even further. However, we do not know if there is a correspondence between these complexity classes and the classes decided by machines with a bounded number of oracle calls (see Ref. [29]).

^i Define log^{(0)}(n) = n and log^{(k+1)}(n) = log(log^{(k)}(n)). We then take the class of advice functions log^{(k)} given by the closure of each bound under addition, multiplication by positive integers, and composition with polynomials (to the right).

Acknowledgments

The research of José Félix Costa is supported by Fundação para a Ciência e a Tecnologia, project FCT I.P.: UID/FIL/00678/2013.

References

1. C. Shannon, Mathematical theory of the differential analyser. J. Math. Phys. 20, 337–354 (1941).
2. H. T. Siegelmann, Neural Networks and Analog Computation: Beyond the Turing Limit (Birkhäuser, 1999).
3. S. Haykin, Neural Networks: A Comprehensive Foundation (MacMillan College Publishing, 1994).
4. M. Davis, The myth of hypercomputation. In C. Teuscher (ed.), Alan Turing: The Life and Legacy of a Great Thinker (Springer, 2006), pp. 195–212.
5. M. Davis, Why there is no such discipline as hypercomputation. Appl. Math. Comput. 178(1), 4–7 (2006).


6. E. Beggs, J. F. Costa, B. Loff, and J. V. Tucker, Computational complexity with experiments as oracles. Proc. Roy. Soc. Ser. A (Math. Phys. Eng. Sci.) 464(2098), 2777–2801 (2008).
7. E. Beggs, J. F. Costa, B. Loff, and J. V. Tucker, Computational complexity with experiments as oracles II. Upper bounds. Proc. Roy. Soc. Ser. A (Math. Phys. Eng. Sci.) 465(2105), 1453–1465 (2009).
8. E. Beggs, J. F. Costa, and J. V. Tucker, Three forms of physical measurement and their computability. The Rev. Symb. Logic 7(4), 618–646 (2014).
9. T. Ambaram, E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, An analogue-digital model of computation: Turing machines with physical oracles. In A. Adamatzky (ed.), Advances in Unconventional Computing, Volume 1 (Theory), Emergence, Complexity and Computation, Vol. 22 (Springer, 2016), pp. 73–115.
10. E. Beggs, J. F. Costa, and J. V. Tucker, Limits to measurement in experiments governed by algorithms. Math. Struct. Comput. Sci. 20(6), 1019–1050 (2010), special issue on Quantum Algorithms, editor Salvador Elías Venegas-Andraca.
11. E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, Oracles that measure thresholds: The Turing machine and the broken balance. J. Logic Comput. 23(6), 1155–1181 (2013).
12. E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, Computations with oracles that measure vanishing quantities. Math. Struct. Comput. Sci. 27(8), 1315–1363 (2017).
13. M. Manthey, Distributed computation, the twisted isomorphism, and autopoiesis. In D. Dubois (ed.), CASYS'97 First International Conference on Computing Anticipatory Systems, Liège (Belgium), August 11–15, 1997, Department of Mathematics, University of Liège (2014).
14. R. Carnap, Philosophical Foundations of Physics (Basic Books, 1966).
15. C. G. Hempel, Fundamentals of concept formation in empirical science. Int. Encycl. Unif. Sci. 2, 7 (1952).
16. D. H. Krantz, P. Suppes, R. D. Luce, and A. Tversky, Foundations of Measurement (Dover, 2009).
17. E. Beggs, P. Cortez, J. F. Costa, and J. V. Tucker, A hierarchy for BPP//log* based on counting calls to an oracle. In A. Adamatzky (ed.), Emergent Computation (Festschrift for Selim Akl), Emergence, Complexity and Computation, Vol. 21 (Springer, 2016), pp. 39–56.
18. E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, An analogue-digital Church-Turing thesis. Int. J. Found. Comput. Sci. 25(4), 373–389 (2014).
19. E. Beggs, J. F. Costa, and J. V. Tucker, Physical experiments as oracles. Bull. Eur. Assoc. Theor. Comput. Sci. 97, 137–151 (2009), an invited paper for the "Natural Computing Column".
20. E. Beggs, J. F. Costa, B. Loff, and J. V. Tucker, On the complexity of measurement in classical physics. In M. Agrawal, D. Du, Z. Duan and A. Li (eds.), Theory and Applications of Models of Computation (TAMC 2008), Lecture Notes in Computer Science, Vol. 4978 (Springer, 2008), pp. 20–30.

21. E. Beggs, J. F. Costa, B. Loff, and J. V. Tucker, Oracles and advice as measurements. In C. S. Calude, J. F. Costa, R. Freund, M. Oswald, and G. Rozenberg (eds.), Unconventional Computation (UC 2008), Lecture Notes in Computer Science, Vol. 5204 (Springer-Verlag, 2008), pp. 33–50.
22. E. Beggs, J. F. Costa, and J. V. Tucker, Physical oracles: The Turing machine and the Wheatstone bridge. Studia Logica 95(1–2), 279–300 (2010), special issue on Contributions of Logic to the Foundations of Physics, editors D. Aerts, S. Smets, and J. P. Van Bendegem.
23. E. Beggs, J. F. Costa, and J. V. Tucker, Axiomatising physical experiments as oracles to algorithms. Philos. Trans. Roy. Soc. Ser. A (Math. Phys. Eng. Sci.) 370 (2012).
24. E. Beggs, J. F. Costa, and J. V. Tucker, The impact of models of a physical oracle on computational power. Math. Struct. Comput. Sci. 22(5), 853–879 (2012), special issue on Computability of the Physical, editors Cristian S. Calude and S. Barry Cooper.
25. E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, A natural computation model of positive relativisation. Int. J. Unconven. Comput. 10(1–2), 111–141 (2013).
26. E. Beggs, J. F. Costa, D. Poças, and J. V. Tucker, On the power of threshold measurements as oracles. In G. Mauri, A. Dennunzio, L. Manzoni, and A. E. Porreca (eds.), Unconventional Computation and Natural Computation (UCNC 2013), Lecture Notes in Computer Science, Vol. 7956 (Springer, 2013), pp. 6–18.
27. E. Beggs and J. V. Tucker, Experimental computation of real numbers by Newtonian machines. Proc. Roy. Soc. Ser. A (Math. Phys. Eng. Sci.) 463(2082), 1541–1561 (2007).
28. J. L. Balcázar, J. Díaz, and J. Gabarró, On characterizations of the class PSPACE/poly. Theoret. Comput. Sci. 52(3), 251–267 (1987).
29. E. Beggs, P. Cortez, J. F. Costa, and J. V. Tucker, Classifying the computational power of stochastic physical oracles. Int. J. Unconven. Comput. 14(1), 59–90 (2018).
30. J. L. Balcázar, J. Díaz, and J. Gabarró, Structural Complexity I, 2nd edn. (Springer-Verlag, 1988, 1995).
31. A. S. Younger, E. Redd, and H. Siegelmann, Development of physical super-Turing analog hardware. In Unconventional Computation and Natural Computation — Proceedings of the 13th International Conference, UCNC 2014, London, ON, Canada, July 14–18, 2014 (Springer-Verlag, 2014), pp. 371–391.
32. J. F. Costa, B. Loff, and J. Mycka, A foundation for real recursive function theory. Ann. Pure Appl. Logic 160(3), 255–288 (2009).
33. K. T. Kelly, The Logic of Reliable Inquiry (Oxford University Press, 1996).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0004

Chapter 4

Exclusively Quantum: Computations Only Possible on a Quantum Computer

Selim G. Akl

School of Computing, Queen's University, Kingston, Ontario, Canada
[email protected]

The property of quantum exclusivity is introduced, referring to computations that can be carried out only on a quantum computer and are impossible, in principle, on a Turing machine. Exclusively quantum computations invalidate the famous conjecture known as the Church–Turing thesis, according to which any computation that can be performed on any computing device can be simulated on a Turing machine. By direct consequence, because they cannot be carried out at all on a Turing machine, whether efficiently or inefficiently, these computations invalidate the extended Church–Turing thesis as well. These computations also demonstrate the falsity of the Principle of Universality — the assertion that there exists a Universal Computer capable of simulating any computation by any other computational device. By reviewing these results, dating mostly from the first two decades of this century, this chapter aims to establish correct historical precedence, especially in light of recent related claims of priority with respect to the violation of the extended Church–Turing thesis by a quantum computer.

4.1. Introduction

On October 23, 2019, an article appeared in the prestigious science magazine Nature entitled “Quantum Supremacy Using a Programmable Superconducting Processor”.1 In it, the authors

reported that their quantum computer had successfully solved a computational problem related to random number generation in 200 s. The same computation, they estimated, would require 10,000 years to perform on the fastest classical supercomputer of the day. Simultaneously, the group in charge of the said supercomputer cast doubt on the claimed quantum prowess. They stated that, according to their calculations, their machine would take 2.5 days (not 10,000 years) to complete the computation in question, in a straightforward way and without any optimization.2,3 The debate is still raging at the time of this writing.4–12 As if all of this were not controversy enough, the authors of the Nature paper concluded by asserting that their quantum computer had, for the first time, violated the extended Church–Turing thesis, according to which any computation by any computer can be efficiently simulated on a Turing machine. The aim of this chapter is to set the record straight concerning previous work on the limitations of the Church–Turing thesis and, by extension, on the extended Church–Turing thesis. Indeed, the failure of the Church–Turing thesis to capture the totality of the vast and mostly uncharted expanse of the computational universe has been meticulously documented by the Parallel and Unconventional Computation Research Group at Queen's University since 2005.13–40 In order to remain within the context created by the recent claims1 and counterclaims,2,3 this chapter's exposition of previous work on computations that contradict the Church–Turing thesis is limited to quantum computations. In what follows, quantum exclusivity is introduced as a more powerful property than quantum “supremacy”. Computations are described in this chapter that can be carried out on a quantum computer but are impossible, in principle, on a Turing machine. This contradicts the Church–Turing thesis, whereby any computation that can be performed on any computing device can be simulated on a Turing machine. By direct consequence, because they cannot be carried out at all on a Turing machine, whether efficiently or inefficiently, these computations invalidate the extended Church–Turing thesis as well. A broader and more important consequence of

this result is that the Principle of Universality in computation is also invalid. It is therefore a fallacy to claim that there exists a finite and fixed Universal Computer, be it the Turing machine, the Random Access Machine, the Cellular Automaton, or any such “universal” model, capable of simulating any computation by any other computer. The presentation of exclusively quantum computations, that is, computations that are possible to carry out only on a quantum computer, is framed in the context of parallel computation,41,42 in general, and parallelism on a quantum computer,32,34–38,43 in particular. There are two reasons for this, namely,

(1) Parallel computation is central to each one of the described counterexamples to the Church–Turing thesis and the Principle of Universality in computation, and
(2) While the information processing power of quantum computers derives in theory from their ability to execute a certain computation simultaneously on all terms of a quantum superposition, parallelism in this chapter refers to the ability to act simultaneously on multiple qubits.

This chapter is organized as follows. Section 4.2 offers some background on quantum computation and quantum information. Section 4.3 reviews early attempts to use the phenomena of quantum physics in order to contradict the Church–Turing thesis. Section 4.4 introduces the class of unconventional computational paradigms that violate the Principle of Universality, and derives a theoretical proof of non-universality in computation. Section 4.5 provides detailed examples of evolving computations, that is, unconventional computational problems in which a parameter of the computation changes with the passage of time. Sections 4.6–4.8 describe five concrete evolving computations that cannot be performed on any classical computer, but can be carried out successfully only on a quantum computer on which operations are executed in parallel, namely:

(1) Computing the quantum Fourier transform and its inverse (rank-varying computational complexity, Section 4.6.1).

(2) Quantum decoherence in computing the quantum Fourier transform (time-varying variables, Section 4.6.5).
(3) Quantum error correction (time-varying computational complexity, Section 4.7.2).
(4) Measuring entangled quantum states (interacting variables, Section 4.8.1).
(5) Maintaining quantum entanglement (transformations obeying a global constraint, Section 4.8.5).

These computational problems demonstrate the incompleteness of the Church–Turing thesis and, as a consequence, the falsity of the Principle of Universality in computation. Finally, Section 4.9 draws some important conclusions based on the evidence presented in the chapter.

4.2. Background: Parallelism and Quantum Theory

The role played by parallelism in the theory of computation depends on the particular paradigm or computational environment considered, but its importance has been confirmed with the emergence of each novel computing technology. In this chapter we study the implications of parallelism in quantum information theory and show that a parallel approach can make the difference between success and failure in many computational problems. An unexpected consequence of this fact is the impossibility of constructing a Universal Computer, as defined herein.

4.2.1. On the importance of parallelism

Parallel computing was originally motivated by the need to speed up computation, especially for those tasks whose sequential running time is prohibitively long. This traditional view of the role played by parallelism in computation has since evolved dramatically, with implications almost impossible to foresee when the field originated. We know today that there are tasks and computational paradigms for which a parallel approach offers much more than just a faster solution. A real-time environment, constraining the

input data provided and the output produced at various moments in time, can have drastic effects on the quality of the solution obtained for a certain problem, unless parallelism is employed.44,45 A general framework is developed in Refs. [29,41,46,47] to show how a superlinear improvement (with respect to the number of processors employed in the parallel approach) in the quality of the solution computed to real-time and other computational problems can be obtained. In other cases, a sequential machine fails to tackle a certain task altogether, and parallelism is the only hope of seeing that task accomplished. Examples of this kind include measuring the parameters of a dynamical system48 or setting them in such a way as to avoid pushing the system into chaotic behavior.49 Also, some geometric transformations can only be performed successfully if a certain number of objects are acted upon simultaneously.50 Of particular relevance to the present exposition are the computations possessing this characteristic that are highlighted in Refs. [28,31,34–36]. Progress in science and technology influences the way computations are carried out, and the emergence of novel computational environments and paradigms continually broadens the applicability and importance of parallelism. In this chapter, we exhibit examples of problems from quantum information theory that clearly emphasize the role of parallelism in this relatively new field of computation governed by the principles of quantum mechanics. The examples we present also reinforce the argument developed in Ref. [15] demonstrating the infeasibility of a Universal Computer, that is, a computer whose parameters are defined once and for all, capable of a finite and fixed number of elementary operations per time unit, and also capable of simulating any algorithm that is executable on any other computing device.

4.2.2. Quantum computation and quantum information

This section introduces the basic elements of quantum computation and quantum information to the extent needed for a clear exposition

of the main ideas presented in this chapter. For a detailed survey of the field the reader is referred to Ref. [33]. Quantum information theory was developed much in analogy with classical information theory, enlarging the scope of the latter. Thus, quantum information theory deals with all the static resources and dynamical processes investigated by classical information theory, as well as additional static and dynamic elements that are specific to quantum mechanics.

4.2.2.1. The qubit

Probably the most fundamental quantum resource manipulated by quantum information theory is the quantum analogue of the classical bit, called the qubit. Though it may have various physical realizations, as a mathematical object the qubit is a unit vector in a 2D state space, for which a particular orthonormal basis, denoted by {|0⟩, |1⟩}, has been fixed. The basis vectors correspond to the two possible values a classical bit can take. However, unlike classical bits, a qubit can also take many other values. In general, an arbitrary qubit |ψ⟩ can be written as a linear combination of the computational basis states:

|ψ⟩ = α|0⟩ + β|1⟩,   (4.1)

where α and β are complex numbers such that |α|² + |β|² = 1 (the normalization condition ensuring that |ψ⟩ is a unit vector). In order to describe the state of a qubit or an ensemble of qubits in a compact way, we have adopted here the well-established bra/ket notation introduced by Dirac.51 For a single qubit |ψ⟩, there exists a useful geometric illustration of its state as a point on a sphere. Writing the complex numbers α and β in polar coordinates,

|ψ⟩ = r_α e^{iφ_α}|0⟩ + r_β e^{iφ_β}|1⟩,

|ψ⟩ can be represented as a unique point on a unit 3D sphere called the Bloch sphere.52,53 Figure 4.1 depicts four possible states of a qubit using the Bloch sphere representation.

[Figure 4.1. The Bloch sphere representation of a qubit, showing |0⟩ and |1⟩ at the poles and the equatorial states (1/√2)(|0⟩ + |1⟩) and (1/√2)(|0⟩ − |1⟩).]

Note that the states

corresponding to the points on the equatorial circle all have equal contributions of 0-ness and 1-ness. What distinguishes them is the phase. For example, two of the states displayed in Figure 4.1, namely (1/√2)(|0⟩ + |1⟩) and (1/√2)(|0⟩ − |1⟩), are the same up to a relative phase shift of π, because the |0⟩ amplitudes are identical and the |1⟩ amplitudes differ only by a relative phase factor of e^{iπ} = −1.

4.2.2.2. Measurements

Let us now turn our attention to the amount of information that can be stored in a qubit and, respectively, retrieved from a qubit. Since any point on the Bloch sphere can be characterized by a pair of real-valued parameters taking continuous values, it follows that, theoretically, a qubit could hold an infinite amount of information. As it turns out, however, we cannot extract more information from such a qubit than we are able to extract from a classical bit. The reason is that we have to measure the qubit in order to determine in which state it is. Yet, according to a fundamental postulate of quantum mechanics (Postulate 3 in Ref. [53]), the amount of information that can be gained about a quantum state through measurement is restricted. Thus, when we measure a qubit |ψ⟩ = α|0⟩ + β|1⟩ with

respect to the standard basis for quantum computation {|0⟩, |1⟩}, we get either the result 0, with probability |α|², or the result 1, with probability |β|². Furthermore, measurement alters the state of a qubit, collapsing it from its superposition of |0⟩ and |1⟩ to the specific state consistent with the result of the measurement. For example, if we observe |ψ⟩ to be in state |0⟩ through measurement, then the post-measurement state of the qubit will be |0⟩, and any subsequent measurements (in the same basis) will yield 0 with probability 1. Naturally, measurements in bases other than the computational basis are always possible, but this will not help us determine α and β from a single measurement. In general, measurement of a state transforms the state into one of the measuring device's associated basis vectors. The probability that the state is measured as basis vector |u⟩ is the square of the norm of the amplitude of the component of the original state in the direction of the basis vector |u⟩. Unless the basis is explicitly stated, we will always assume that a measurement is performed with respect to the standard basis for quantum computation.

4.2.2.3. Putting qubits together

Let us now examine more complex quantum systems, composed of multiple qubits. In classical physics, the individual 2D state spaces of n particles combine through the Cartesian product to form a vector space of 2n dimensions, representing the state space of the ensemble of n particles. However, this is not how a quantum system can be described in terms of its components. Quantum states combine through the tensor product to give a resulting state space of 2^n dimensions, for a system of n qubits. For a system of two qubits, each with basis {|0⟩, |1⟩}, the resulting state space is the set of normalized vectors in the 4D space spanned by the basis vectors {|0⟩ ⊗ |0⟩, |0⟩ ⊗ |1⟩, |1⟩ ⊗ |0⟩, |1⟩ ⊗ |1⟩}, where |x⟩ ⊗ |y⟩ denotes the tensor product between the column vectors |x⟩ and |y⟩. It is customary to write the basis in the more compact notation {|00⟩, |01⟩, |10⟩, |11⟩}. This generalizes in the obvious way to an n-qubit system with 2^n basis vectors.
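These two rules, Born-rule measurement and tensor-product composition, are easy to mirror in a few lines of linear algebra. The following Python/NumPy sketch is an illustration added here (the state vectors and the sampling routine are our own), not part of the chapter's formal development:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

def measure(state, shots=10_000, seed=0):
    """Sample computational-basis outcomes: k occurs with probability |amplitude_k|^2."""
    probs = np.abs(state) ** 2
    return np.random.default_rng(seed).choice(len(state), size=shots, p=probs)

# |psi> = (|0> + |1>)/sqrt(2): outcomes 0 and 1 each occur with probability 1/2
psi = (ket0 + ket1) / np.sqrt(2)
print(np.mean(measure(psi) == 0))   # close to 0.5 = |alpha|^2

# Qubits combine via the tensor (Kronecker) product: |0> tensor |psi> lives in
# the 4-dimensional space spanned by |00>, |01>, |10>, |11>
print(np.kron(ket0, psi))           # amplitudes [0.707..., 0.707..., 0, 0]
```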

4.2.2.4. Entanglement

Similar to single qubits, multiple-qubit systems can also be in a superposition state. The vector

|Φ⟩ = (1/√2)|00⟩ + (1/√2)|11⟩   (4.2)

describes such a superposition state for a two-qubit system. But the state |Φ⟩ has a very interesting property. It is not possible to find complex numbers α, β, γ and δ such that

(α|0⟩ + β|1⟩) ⊗ (γ|0⟩ + δ|1⟩) = αγ|00⟩ + αδ|01⟩ + βγ|10⟩ + βδ|11⟩ = (1/√2)|00⟩ + (1/√2)|11⟩.   (4.3)

Consequently, the state of the system cannot be decomposed into a product of the states of its constituents. Even though the state of the system is well defined (through the state vector |Φ⟩), neither of the two component qubits is in a well-defined state. This is again in contrast to classical systems, whose states can always be broken down into the individual states of their components. Furthermore, if we try to measure the two qubits, the superposition will collapse into one of the two basis vectors contributing to the superposition, and the outcomes of the two measurements will always coincide. Therefore, we say that the two qubits are entangled (a name given to the phenomenon by Schrödinger54) and |Φ⟩ describes an entangled state of the system. Entanglement defines the strong correlations exhibited by two or more particles when they are measured, and which cannot be explained by classical means. This does not imply that entangled particles will always be observed in the same state, as entangled states like

(1/√2)|01⟩ ± (1/√2)|10⟩   (4.4)

prove it. States like these, or the one in Equation (4.2), are known as Bell states or EPR pairs after some of the people55,56 who pointed out their strange properties.

4.3. Some History

The purpose of this section is to review some of the work in quantum computation that attempted to demonstrate the inadequacy of the Church–Turing thesis.

4.3.1. Quantum versus classical computation

Is a quantum computer strictly more powerful than a classical one? Are there information processing tasks for which only a machine based on quantum mechanical principles is naturally suited? What are the limitations when trying to simulate a quantum process on a classical computing machine? These questions have concerned researchers in quantum computation and quantum information theory ever since the field originated. Despite the impressive advances made in quantum computation and quantum information, the fundamental question about the relative power of a quantum computer with respect to its classical counterpart is still not fully answered. Perhaps this is partly due to the multitude of contexts (or paradigms) in which such a question might be asked. Consequently, there may not be a single answer. In this chapter, we analyze the relation between the quantum and the classical models of computation from the broad perspective offered by quantum mechanics. In this new framework established by the laws of quantum mechanics, the concept of information acquires new dimensions due to the principle of superposition of states. Quantum information is, consequently, qualitatively different from classical information, the latter becoming just a simple, particular case of the former. This generalization has two major implications. Perhaps of most interest is that the postulates of quantum mechanics allow us to design conceptually novel tools for processing information, leading to a significant increase in computational efficiency. But this

radical departure from the classical ways of treating information also results in the formulation of novel information processing tasks. These problems involve purely quantum mechanical features and cannot be translated into classical terms. Information manipulation is still regarded today as the transformation and/or communication of classical information. This assumption is implicit, and it is well justified by the fact that virtually all computing devices in use today operate in accord with classical physics. The problems we can solve and the tasks we can accomplish with the aid of a computer are defined in terms of the classical input data provided to the computer. Quantum computation and quantum information, however, are challenging this viewpoint, which may seem natural at this particular moment in time. But if a quantum computer is ever to become a practical reality, this perception will necessarily have to change. By representing and manipulating information at a deeper physical level, the quantum computer can tackle problems that are impossible for a conventional computer. Although classical information about quantum systems can be acquired through quantum measurements and, in general, it is possible to simulate a quantum process on a conventional machine (even if inefficiently), there are tasks that can only be accomplished successfully if one has the ability to operate directly at the physical (quantum) level used to encapsulate information. Therefore, in the same way quantum mechanics augments the understanding of physical reality, pointing out the limitations of previous physical theories, quantum computation enlarges the scope of information processing, exposing the limitations of conventional computation. The fact that a function with an argument expressible only in quantum mechanical terms (as a superposition of states) is not computable by a conventional computing device capable of processing information only at the classical level may appear as unfair to the classical computer. In fact, such functions are just revealing the natural, physical limitations of conventional computing devices. Thus, a quantum computer is intrinsically more powerful than a classical (conventional) one, as the set of problems solvable (functions computable) by the latter is a proper subset of those solvable

(computable) by the former. It is only when we restrict the input to be classical for both machines that the relative power of a quantum computer with respect to a classical one remains debatable. It is worth emphasizing that the gap between a quantum and a classical machine is qualitatively different from the relationship between an analog and a discrete (or digital) machine. The difference between an analog computer and a digital one is purely quantitative, in the sense that while the former can (theoretically) operate with real numbers, the latter can only approximate them. But this approximation can be made arbitrarily accurate, given enough time and memory on the digital machine. Consequently, when presented with an analog input, the digital computer can always obtain a representation of it that is as close to the original as the computer's resources permit. Since both machines work at the same physical level (manipulating voltage levels, for example), there is no task that one machine could do that the other would be unable to tackle. In contrast, the operating principles of a quantum computer are qualitatively different from those governing the behavior of a conventional computer, thus allowing the former to address information processing tasks that are simply beyond the scope of the latter. Non-determinism and operating on quantum superpositions can each be successfully simulated on a machine whose functioning obeys the laws of classical physics. However, we show in this chapter that there are problems merging non-determinism and entanglement (the latter seen as a particular instance of superposition involving multiple qubits) in such a way that a solution based on classical means is no longer possible. Distinguishing among entangled quantum states forms the basis for a whole class of problems requiring information manipulation that are only solvable by a machine endowed with the power of quantum computing. This demonstrates that the limitations of the classical model of computation are purely physical, and that a computer operating through quantum means is strictly more powerful than a conventional one. In the following section, some of the contexts in which the comparison between the classical and the quantum computer has taken place are reviewed and made explicit. This will help emphasize the

variety of angles from which the problem can be attacked and also put the approach adopted in this chapter into perspective.

4.3.2. A review of previous results

The first step towards an analytical investigation of the computational power specific to a quantum mechanical device was the elaboration of a model abstracted away from any particular physical realization. The breakthrough came when David Deutsch described the operation of a Universal Quantum Computer Q, a model of computation inspired by the classical Turing machine, but whose functioning obeys the principles of quantum mechanics. Even in this early paper,57 several features are identified with respect to which the quantum Turing machine is superior to any classical device.

4.3.2.1. True randomness

The first example given is the generation of true random numbers. In particular, valid programs are shown to exist for Q that deal with arbitrary irrational probabilities, a feature that the universal Turing machine T cannot truly match. The latter can only simulate such discrete finite stochastic systems with arbitrary accuracy, provided it has access to a “random oracle”, which cannot really be implemented by classical means.

4.3.2.2. Entangled states

But the property of Q that cannot be even approximately simulated by any classical system is the generation of entangled (or non-separable) states like

(1/√2)(|00⟩ + |11⟩).   (4.5)

The strong correlations exhibited by the two qubits composing state (4.5) are characteristic only of the quantum resource known as entanglement, and they are simply beyond the scope of any classical Turing machine. Bell's theorem55 is a mathematical formulation of

the fact that no classical system can reproduce the statistical results obtained by measuring these two qubits.

4.3.2.3. Quantum speed-up

As another argument intended to prove the superior computational power of the quantum Turing machine, Deutsch provides an example which demonstrates how quantum parallelism can be used to speed up computation. Quantum parallelism refers to the capability of a quantum computer to evaluate a function f(x) for exponentially many different values of x in the time it takes a classical computer to evaluate the function for just one value. This is possible due to the quantum mechanical principle of the superposition of states. Deutsch exploited this feature and devised an example in which quantum parallelism augmented with interference can “beat” a classical computer. Thus, given a function f : {0, 1} → {0, 1}, he presented a quantum algorithm able to compute f(0) ⊕ f(1) in a single evaluation of the function f. Later, Deutsch's algorithm was generalized by Deutsch and Jozsa,58 who addressed the n-bit case by allowing the domain of f to be the set of all integers in the interval [0, 2^n − 1]. In just one evaluation of the function f, the Deutsch–Jozsa algorithm is able to determine whether f is constant or perfectly balanced (the latter property meaning that f maps exactly half of the input values in its domain to 0, and the other half to 1). Although the problem seems somewhat contrived, with no immediate practical applications, this was the first example in which the quantum computer achieved an exponential speed-up over the classical one (note that a deterministic classical Turing machine needs an exponential number of evaluations of f, in the worst case, in order to decide between constant and perfectly balanced).
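Deutsch's original single-bit algorithm is small enough to simulate directly with matrices. The sketch below is our illustration (the oracle encoding |x, y⟩ → |x, y ⊕ f(x)⟩ is the standard one; function and variable names are assumptions): it decides whether f(0) ⊕ f(1) is 0 or 1 using a single application of the oracle.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I = np.eye(2, dtype=complex)

def oracle(f):
    """Permutation matrix for U_f |x,y> = |x, y XOR f(x)>, basis index 2x + y."""
    U = np.zeros((4, 4), dtype=complex)
    for x in (0, 1):
        for y in (0, 1):
            U[2 * x + (y ^ f(x)), 2 * x + y] = 1
    return U

def deutsch(f):
    state = np.kron(np.array([1, 0]), np.array([0, 1])).astype(complex)  # |01>
    state = np.kron(H, H) @ state   # superpose both inputs
    state = oracle(f) @ state       # one oracle call (phase kickback)
    state = np.kron(H, I) @ state   # interfere
    prob_first_qubit_0 = abs(state[0]) ** 2 + abs(state[1]) ** 2
    return "constant" if prob_first_qubit_0 > 0.5 else "balanced"

print(deutsch(lambda x: 0))  # constant
print(deutsch(lambda x: x))  # balanced
```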

The same superiority of the quantum computer was proved by Shor's factorization algorithm,59 only this time for a problem of huge practical importance. Factoring large integers and computing discrete logarithms in quantum polynomial time threatens the security of a large class of public-key cryptographic systems in use today. For a classical computer these tasks remain intractable, despite remarkable advances that have only brought their running time down to a sub-exponential level.60 So, in the context of essentially speeding up the computation for some problems, it is possible to affirm that a quantum computer is definitely more powerful than a classical one. However, one should keep in mind that these problems can also be solved by the universal Turing machine, given enough time (even if this time exceeds the age of the Universe).

4.3.2.4. Quantum simulations

Another class of tasks at which quantum computers naturally outperform any classical machine is simulating quantum mechanical systems occurring in Nature. As the size (number of constituents) of a quantum system increases, the number of variables required to describe its state grows exponentially. So, in order to store the quantum state of a system with n distinct components, a classical computer would need some c^n bits of memory, with the constant c depending upon the system being simulated and the desired accuracy of the simulation. Furthermore, calculating its evolution over time would require the manipulation of a huge matrix, involving c^n × c^n bits. As Feynman noted in 1982,61 this is prohibitively inefficient for a simulator observing the laws of classical physics. On the other hand, a machine that worked by quantum means would intrinsically make a much more efficient simulator, requiring only a linear number of qubits. Following the same logic, it is not difficult to envisage a classical Turing machine that simulates an arbitrary quantum circuit, if one does not care about efficiency. The simulation in Ref. [62] requires space, and therefore time, exponential in the number of qubits in the quantum circuit. Bernstein and Vazirani63 have given a simulation that takes polynomial space, but exponential time. The lack of an efficient classical simulation of a quantum computer induced the idea that a quantum computing machine may be inherently faster and therefore strictly more powerful. However, any computation a quantum computer can perform, by applying a series of unitary
evolutions to its quantum register, can be replicated (even if highly inefficiently) by a deterministic Turing machine (DTM). Similarly, a probabilistic Turing machine (PTM) can simulate the inherent probabilistic nature of a quantum measurement operation.

4.3.2.5. QTM versus DTM and PTM

The contest between the quantum and the classical computer can also be judged from these two points of view: comparing the quantum Turing machine (QTM) with a DTM or with a PTM. The Deutsch–Jozsa algorithm, for instance, achieves an impressive speed-up over a DTM, but the problem is also easy for a PTM, which can solve it very quickly with high probability. The first hint that QTMs might be more powerful than PTMs was given by Bernstein and Vazirani, who showed how to sample from the Fourier spectrum of any Boolean function on n bits in polynomial time on a QTM.63 No algorithm was known to replicate this result on a PTM. Then, Berthiaume and Brassard were able to construct an oracle relative to which there exists a decision problem that can be solved with certainty in worst-case polynomial time on a quantum computer, but cannot be solved classically in probabilistic expected polynomial time if errors are not tolerated.64 In the same paper, they also show that there is a decision problem solvable in exponential time on a QTM, but requiring double exponential time on all but finitely many instances on any DTM. These two results, besides being a victory of quantum computers over classical machines (deterministic or probabilistic), also prove that the power of quantum computation cannot simply be ascribed to the indeterminism inherent in quantum theory.

4.3.2.6. Quantum versus classical complexity

The great hope for quantum computers at the inception of the quantum paradigm of computation was that they would be able to make NP-complete problems tractable. Relative to this criterion, we still do not know whether a quantum machine is more powerful than

a classical one, in spite of Shor's results concerning factorization and computing discrete logarithms. The trouble is that neither of these two problems is known to be NP-complete, despite the general belief that they are not in P. Furthermore, the current belief is that a quadratic improvement in the running time may be the best we can get out of a quantum computer for these kinds of tasks.65 The relative power of quantum computers with respect to classical ones can also be couched in terms of the relationships between classical and quantum complexity classes. In this sense, the complexity class Bounded-error Probabilistic Polynomial time (BPP) and its quantum analogue BQP have attracted a lot of interest. Proving that BPP ⊊ BQP is regarded as proving that quantum computers are strictly more powerful than classical computers. This may be quite non-trivial to demonstrate, since BPP ⊊ BQP implies that P is not equal to PSPACE, a result that many researchers have unsuccessfully attempted to prove. However, if a non-classical approach is adopted and the input is allowed to be described in non-classical terms (genuine quantum mechanical terms, in our case), then we can show that the set of problems solvable efficiently by a classical computer (deterministic or probabilistic) is strictly included in the set of problems having an efficient quantum solution.

4.3.2.7. Super-Turing computations

We end this exposition of working hypotheses for comparing quantum and classical computers with the most “exotic” cases. Some researchers have shown that there are quantum processes which can be used to compute the solution to Turing-uncomputable (or undecidable) problems. Calude and Pavlov66 describe a mathematical quantum device that is able to determine, with a pre-established precision, whether an arbitrary program halts or not. Kieu67 uses quantum adiabatic processes to provide a single, universal procedure, taking the form of a quantum algorithm, that solves Hilbert's 10th problem (which has been shown to be equivalent to Turing's halting problem). The essence of these results is that there exist mathematical constructions, built within the framework

provided by the physical theory of quantum mechanics, which are powerful enough to tackle with success problems that have been proved to be beyond the capabilities of the Turing machine. A few observations have to be made with respect to the features empowering these quantum “hypercomputers”. They manage to compute the “uncomputable” by eluding, in one way or another, the finiteness condition (see Section 4.4 and Ref. [15]). The method employed by Calude and Pavlov (a quadratic form of an iterated map acting on randomly chosen vectors, the latter viewed as special trajectories of two Markov processes working in two different scales of time) encodes the whole data into an infinite superposition. Kieu too works with a dimensionally infinite Hilbert space in his quantum adiabatic algorithm. However, he argues that the number of dimensions is only required to be sufficiently large, but finite. Furthermore, an important common characteristic of both algorithms is their probabilistic nature. The answer they give to a problem has only a certain probability of being correct. This probability can be made arbitrarily close to 1, but it can never reach 1 as long as the quantum procedure is only allowed to run for a finite amount of time. Finally, we note that the models of computation capable of such performances are mathematical objects, with no constructive indications offered for attempting the experimental realization of such a machine (assuming such a realization is possible). From this point of view, they are better characterized as quantum “hypercomputers”, as opposed to a “standard” quantum computer capable of running Shor's algorithm, for example. Although a bit more philosophical, Roger Penrose's claim that some quantum phenomena occurring in the neurons composing the human brain could be used to explain our conscious mentality is made along the same lines. He submits that no computational model is strong (or powerful) enough to fully explain the actions of the human brain.68,69 In particular, no classical Turing machine is capable of simulating the complex quantum processes that are supposed to take place inside our “quantum hypercomputer” (that is, our brain).

4.4. Unconventional Computations and Non-universality

Some of the computations carried out today are qualitatively different from those performed more than three-quarters of a century ago, when the age of computers was only just beginning. The traditional concept of computation is best captured by the functioning of the Turing machine. A sequence of operations (or transformations) forming the algorithm is applied to a set of input data to produce an output (or result). There are no space or time limitations, nor any restrictions imposed on the input or output data. The whole input is available at the outset (on the tape, in the case of the Turing machine) and the result is reported (placed on the tape) when the computation terminates (assuming it does). The majority of the computations performed today on various electronic computers and computing devices fit the above description. We refer to them as conventional computing paradigms and, given enough memory and time, the problems they pose can be solved by any of today's computers. The Turing machine stands as a mathematical model, an abstract prototype, for any of these computing devices. Over time, however, this rather simplistic view of computation has been challenged by increasingly demanding applications and real-world problems. For example, we need better solutions, faster, to problems whose input specifications may vary with time. Often, the results of certain computations need to be obtained before certain deadlines, or else various penalties apply. The information processing tasks we face today, or those we discover to take place in Nature, often possess attributes which make them unsuitable for a Turing machine. These attributes usually describe the dynamic nature of a computation, beginning with the way the input is presented, continuing with the characteristics of the algorithm operating on the input data, and ending with the possible constraints that can be imposed on the output. We call such a computation unconventional, as opposed to the general pattern exhibited by Turing computations. In this chapter we address five unconventional computing paradigms

(enumerated below), sharing the generic property that the input variables are temporally and/or spatially interconnected. Each of these paradigms is exemplified through a concrete instance provided by quantum mechanics. With the advent of unconventional computing paradigms, the Principle of Universality, too, needs to be revised, or at least clarified. To this end, parallelism may offer the means to show that a Universal Computer with fixed and finite physical characteristics (speed, number of processing units, etc.) cannot be built.15 An infinite hierarchy of computing devices exists, with each machine capable of simulating any one below it in the hierarchy, but none above, because it lacks the number of processing elements necessary for coping with the degree of parallelism inherent in certain applications. In the general framework of evolving computations, whose characteristics vary during their execution, there are many paradigms for which a parallel computing approach is most appropriate, if not vital for the success of the computation.22 Here are some examples:

• The computational complexity of a step in a certain computation may depend on the time when the step is executed or on the order of execution (rank) of that step within an algorithm that solves the problem at hand.
• The variables upon which the algorithm is supposed to act are affected by the passage of time.
• The input data are interconnected in such a way that operating on any one value inevitably disturbs the others.
• At each step of the computation, a global, mathematical constraint has to be obeyed.

In each of the above cases, a problem instance of size n can only be solved by a machine equipped with at least n processing units, and the solution cannot possibly be simulated by another machine with fewer processors. This observation is at the heart of the impossibility of achieving universality in computing.
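A toy numerical illustration (ours, with invented data; not from the chapter) of the second bullet point, and of why an instance of size n can demand n processing units, runs as follows:

```python
import numpy as np

# Toy setting: n inputs change unpredictably at every time unit, and
# F(x_0, ..., x_{n-1}) must be evaluated on values that all held at the
# same instant t0.
rng = np.random.default_rng(1)
n, t0 = 5, 3
history = rng.integers(0, 100, size=(10, n))   # history[t, i] = x_i(t)

# n processors read all inputs simultaneously at t = t0:
F_parallel = history[t0, :].sum()              # say F is the sum of the inputs

# A single processor reads one input per time unit starting at t0,
# so by the time it reads x_i it sees x_i(t0 + i) instead of x_i(t0):
F_sequential = sum(history[t0 + i, i] for i in range(n))

print(F_parallel, F_sequential)  # almost surely different: the snapshot is lost
```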

Formally, let U be a computational model purported to be universal. Thus, U could be the Turing machine, the random access machine, the parallel random access machine, the cellular automaton, the quantum Turing machine, or any one of the multitude of reasonable models of computation claimed to be universal. Because of its status as a putative Universal Computer, U must obey the following two properties:

Property 1 (The Finiteness Condition). U can perform a finite and fixed number of elementary arithmetic or logical operations per time unit, and

Property 2 (The Principle of Simulation). U can simulate any computation that is possible on any other computer.

In fact, no model of computation can satisfy these two properties at the same time. As we prove in the Non-universality Theorem, no model of computation that satisfies Property 1 can satisfy Property 2. In other words, no finite and fixed model of computation can be universal.

Non-universality Theorem: No computer satisfying Property 1 can be universal.

Proof. Suppose that computer U can perform V elementary operations per time unit. Now let C be a computation that requires, in order to be performed successfully, that W operations be executed in parallel per time unit, where W > V. Clearly, U cannot execute C (and consequently C fails). Another computer U′, capable of W parallel operations per time unit, succeeds in executing C. However, U′ is in turn defeated by a computation C′ requiring W′ parallel operations per time unit, where W′ > W. This continues ad infinitum, and no finite and fixed computer is universal. ∎

A direct corollary to this theorem is that no model of computation that satisfies Property 2 can satisfy Property 1. In other words,

in order for a model of computation to be universal, it must be capable of an unbounded number of operations per time unit. This non-universality result applies to all models of computation, existing or anticipated, conventional or unconventional, so long as they are defined once and for all and are capable of only a finite and fixed number of operations per time unit. The result does not impose any limits on the memory needed by the putative Universal Computer to perform a computation. Nor is there a bound on the total duration of the computation (that is, there is no deadline). Furthermore, there are no restrictions on the ability of the putative Universal Computer to communicate with the outside world for reading inputs or for writing outputs during the computation. It should be noted in passing that the Non-universality Theorem also applies to the so-called universal Turing machine, the latter being simply a Turing machine that is “universal” only in the sense of being capable of simulating all special-purpose Turing machines, and nothing else. In what follows, it is shown that quantum information processing provides excellent examples of evolving computing paradigms, and that the need for parallelism in this newly emerged field of unconventional computation transforms the Universal Computer into a myth. Here is a preview of the contents of the next few sections. Section 4.5 introduces the concept of evolving computations, and five examples of such computing paradigms are described therein. For each of these paradigms, a quantum mechanical instance is presented in the ensuing three sections. Section 4.6 shows how the procedures for computing the quantum Fourier transform and its inverse can be decomposed into steps of rank-varying computational complexity. In a practical setting, quantum decoherence places a hard deadline on when these computations have to be completed, offering a typical example of time-varying variables. Fortunately, the use of a parallel architecture can reduce the execution time and help complete the computation before the sensitive quantum information leaks into the surrounding environment. Section 4.7 is concerned with quantum error-correction schemes from the viewpoint of time-varying computational complexity. In

Section 4.8, the focus is on entanglement among qubits, first as a quantum instance of the interacting variables paradigm. Since, technically, entanglement is a mathematical constraint imposed on the quantum state of the whole ensemble, it is then asserted that a quantum computation that has to maintain entanglement at all times belongs to the paradigm of computations obeying a global constraint. Furthermore, such a computation can only be carried out by manipulating the whole ensemble as a single entity (in other words, by acting in parallel on all components).

4.5. Evolving Computations

Evolution (or merely change) is a fundamental attribute of many systems that are commonly observed and investigated, whether they are physical, biological, economic, social or of any other nature. Yet, until recently, computational systems whose characteristics change during the computational process itself did not receive much attention. In this section, five computing paradigms are described, which are labeled unconventional precisely because of their dynamic nature. At an abstract level, the following generic problem needs to be solved: a set of n input variables x0, x1, . . . , xn−1 has to be read, a certain function F(x0, x1, . . . , xn−1) must be computed, and the result reported. In the first two of the five cases to be described, the focus is on the algorithm employed to compute the function F. What evolves during the computation is the complexity of each step in the algorithm.

4.5.1. Evolving computational complexity

When analyzing the computational complexity of a given algorithm, the focus is usually on how this quantity varies as a function of the problem size, without paying too much attention to how the complexity of each step in the algorithm varies throughout the computation. Though in many cases the complexity of each step is a constant, there are computations for which the cost of executing essentially similar steps differs from one step to another.
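As a concrete warm-up for the examples discussed in the next subsection, the following sketch (an illustration added here, not from the chapter) counts the bit flips performed by successive increments of a binary counter; the cost of the “same” increment step visibly varies with its rank:

```python
def increment_cost(counter):
    """Increment a list of bits (least significant first) in place;
    return the number of bits flipped by this single increment."""
    flips, i = 0, 0
    while i < len(counter) and counter[i] == 1:
        counter[i] = 0
        flips, i = flips + 1, i + 1
    if i < len(counter):
        counter[i] = 1
        flips += 1
    return flips

counter = [0] * 8
costs = [increment_cost(counter) for _ in range(16)]
print(costs)  # [1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1, 5]
```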

4.5.1.1. Rank-varying computational complexity

One factor that can dictate the complexity of a step is its rank, defined as the order of execution of that step. For instance, if the cost of executing the ith step of an algorithm is c(i) = 2^i elementary operations or time units, then the computational complexity of a step grows exponentially with its rank, for that algorithm. In other cases, the computational complexity of a step may actually decrease with the rank. An algorithm made up of n steps for which c(i) = n − i + 1, for i = 1, . . . , n, illustrates such a situation. Examples of this kind are hardly new. Euclid's algorithm for computing the greatest common divisor of two numbers executes the same basic operation (a division) at each step, but the size of the operands (and implicitly the complexity of the operation) decreases continually. Algorithms to which an amortized analysis can be applied also make good examples of rank-varying computational complexity. Incrementing a binary counter70 is a procedure in which the number of bit flips at each step is not constant, though it is neither strictly increasing nor strictly decreasing with the rank.

4.5.1.2. Time-varying computational complexity

Alternatively, the relentless passage of time can directly influence the computational complexity of a given step in the algorithm. The difference between a rank-driven and a time-driven computational complexity can probably best be synthesized in the following manner. If the cost of executing step S_j depends only on the state of the system after executing the previous j − 1 steps, regardless of how much time was consumed to reach that state, then this is clearly an example of rank-varying computational complexity. But if the complexity of S_j is a function of the particular moment in time when that step is executed, then what we have is a procedure with steps of time-varying computational complexity. For example, if the computational complexity of S_j is described by the function c(t) = 2^(2^t), then the computational resources required to complete

that step grow rapidly with the moment in time when S_j is actually executed.

4.5.2. Evolving computational variables

For the three remaining examples of unconventional computing paradigms, the focus moves from the algorithm to the input variables x0, x1, . . . , xn−1, which now determine the dynamics of the system.

4.5.2.1. Time-varying variables

In the paradigm dealing with time-varying variables, time again plays the main role. Each argument of the function F is itself a function of time: x0(t), x1(t), . . . , xn−1(t). At each time unit, the values assumed by the input variables change in such a way that the new value cannot be predicted from the former, nor the former recovered from the latter. Certainly, this makes the computation of F(x0(t0), . . . , xn−1(t0)) at the precise moment t = t0 a challenging task if we do not have the capability of reading all n input variables, in parallel, at the right moment.

4.5.2.2. Interacting variables

But even if the input variables are not affected by the passage of time, the computational environment may still change during the computation. In the next paradigm described here, it is the interaction among mutually dependent variables, caused by an interfering agent (performing the computation), that is the origin of the evolution of the system under consideration. Thus, a relationship exists between x0, x1, . . . , xn−1 that connects them together. Any attempt to read the value of any one variable will inevitably and unpredictably disturb the values of the remaining variables. More precisely, the act of reading xi, for any i ∈ {0, 1, . . . , n − 1}, causes the system to make an irreversible transition from the state (x0, x1, . . . , xi, . . . , xn−1) to a state (x0′, x1′, . . . , xi, . . . , xn−1′). In this way, some of the values needed in the computation of F may be lost

without possibility of recovery. This is the hallmark of the interacting variables paradigm. 4.5.2.3. Computations obeying a global constraint Finally, the relationship among the input variables may take the form of a global property P(x0 , x1 , . . . , xn−1 ) that characterizes the initial state of the system and which must be maintained throughout the computation. In particular, if the effect of the computation is to change xi to xi at some point, then P(x0 , x1 , . . . , xi , . . . , xn−1 ) must be true for the new state of the system. If the property P is not satisfied at a given moment of the computation, the latter is considered to have failed. As the following sections prove it, each of these five unconventional paradigms of computation admits a quantum mechanical instance that requires a parallel approach for a successful outcome. 4.6. Quantum Fourier Transform The Fourier transform is a very useful tool in computer science and it proved of crucial importance for quantum computation as well. Since it can be computed much faster on a quantum computer than on a classical one, the discrete Fourier transform allows for the construction of a whole class of fast quantum algorithms. Shor’s quantum algorithms for factoring integers and computing discrete logarithms59 are the most famous examples in this category. The quantum Fourier transform is a linear operator whose action on any of the computational basis vectors |0, |1, . . . , |2n − 1 associated with an n-qubit register is described by the following transformation: n

$$
|j\rangle \longrightarrow \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2\pi i\,jk/2^n}\, |k\rangle, \qquad 0 \le j \le 2^n - 1. \tag{4.6}
$$
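As an illustration of transformation (4.6), the operator can be written down explicitly as a 2^n × 2^n matrix and checked numerically. The following NumPy sketch, whose names are chosen here purely for exposition, builds that matrix and verifies that it is unitary and that it maps a basis vector to a balanced superposition:

import numpy as np

def qft_matrix(n):
    # Matrix of (4.6): entry (k, j) is exp(2*pi*i*j*k / 2^n) / sqrt(2^n).
    N = 2 ** n
    j = np.arange(N)
    return np.exp(2j * np.pi * np.outer(j, j) / N) / np.sqrt(N)

F = qft_matrix(3)
assert np.allclose(F @ F.conj().T, np.eye(8))      # unitary: F F^dagger = I
basis_state = np.zeros(8); basis_state[5] = 1.0    # |j> with j = 5
amplitudes = F @ basis_state
assert np.allclose(np.abs(amplitudes) ** 2, np.ones(8) / 8)  # balanced superposition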

However, the essential advantage of quantum computation over classical computation is that the quantum mechanical principle of superposition of states allows all possible inputs to be processed at the same time. Consequently, if the quantum register is in an arbitrary superposition of the basis vectors

$$\sum_{j=0}^{2^n-1} x_j |j\rangle,$$

then the quantum Fourier transform will rotate this state into another superposition of the basis vectors

$$\sum_{k=0}^{2^n-1} y_k |k\rangle,$$

in which the output amplitudes y_k represent the discrete Fourier transform of the input amplitudes x_j. Classically, we can compute the numbers y_k from the x_j using Θ(2^{2n}) elementary arithmetic operations in a straightforward manner, or in Θ(n2^n) operations by using the Fast Fourier Transform algorithm. In contrast, a circuit implementing the quantum Fourier transform requires only Θ(n^2) elementary quantum gates. Such a circuit can easily be derived if equation (4.6) is rewritten as a tensor product of the n qubits involved:

$$
|j_1 j_2 \cdots j_n\rangle \longrightarrow \frac{\left(|0\rangle + e^{2\pi i\,0.j_n}|1\rangle\right) \otimes \left(|0\rangle + e^{2\pi i\,0.j_{n-1}j_n}|1\rangle\right) \otimes \cdots \otimes \left(|0\rangle + e^{2\pi i\,0.j_1 j_2 \cdots j_n}|1\rangle\right)}{2^{n/2}}, \tag{4.7}
$$

using the binary representation j_1 j_2 ··· j_n of j and binary fractions in the exponents (for full details see Ref. [53]). Note that each Fourier-transformed qubit is in a balanced superposition of |0⟩ and |1⟩. These qubits differ from one another only in the relative phase between the |0⟩ and |1⟩ components. For the first qubit in the tensor product, j_n will introduce a phase shift of 0 or π, depending on whether its value is 0 or 1, respectively. The phase of the second qubit is determined (controlled) by both j_n and j_{n-1}; it can amount to π + π/2, provided j_{n-1} and j_n are both 1. This dependency on the values of all the previous qubits continues up to (and including) the last term in the tensor product. When |j_1⟩ gets Fourier transformed, the coefficient of |1⟩ in the superposition involves all the digits in the binary expansion of j. In the case of each qubit, the 0 or π phase induced by its own binary value is implemented through a Hadamard gate. The dependency on the previous qubits is reflected in the use of controlled phase shifts, as depicted in Figure 4.2.

Figure 4.2. Quantum circuit performing the discrete Fourier transform.

In the figure, H denotes the Hadamard transformation

$$H \equiv \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix},$$

while the gate R_k implements a π/2^{k-1} phase shift of the |1⟩ component, according to the unitary transformation

$$R_k \equiv \begin{pmatrix} 1 & 0\\ 0 & e^{2\pi i/2^k} \end{pmatrix}.$$

4.6.1. Rank-varying complexity

Computing the quantum Fourier transform and its inverse can also be seen as examples of algorithms with rank-varying complexity. According to the quantum circuit in Figure 4.2, we need n Hadamard gates and (n-1) + (n-2) + ··· + 1 conditional rotations, for a total of n(n+1)/2 gates required to compute the Fourier transform on n qubits. But this total amount of work is not evenly distributed over the n qubits. The number of gates a qubit must pass through is in inverse relation to its rank. Thus, |j_1⟩ is subjected to n elementary quantum gates, n-1 elementary unitary transformations are applied to |j_2⟩, and so on, until |j_n⟩, which needs only one basic operation. If we break the quantum Fourier transform algorithm into n steps (one for each qubit involved), then its complexity varies with each step. Starting with |j_1⟩, the time needed to complete each step decreases over time. Since the rank of each step dictates its complexity, the circuit implementing the quantum Fourier transform is an example of an algorithm of rank-varying complexity.

Naturally, the computation of the inverse quantum Fourier transform can also be decomposed into steps of varying complexity. Reversing each gate in Figure 4.2 gives us an efficient quantum circuit (depicted in Figure 4.3) for performing the inverse Fourier transform. Note that the Hadamard gate is its own inverse, and R_k† denotes the conjugate transpose of R_k:

$$R_k^\dagger \equiv \begin{pmatrix} 1 & 0\\ 0 & e^{-2\pi i/2^k} \end{pmatrix}.$$

Getting back to the original |j_1 j_2 ··· j_n⟩ from its Fourier-transformed expression has a certain particularity, however. Because of the interdependencies introduced by the controlled rotations, the procedure must start by computing |j_n⟩ and then work its way up to |j_1⟩. The value of |j_n⟩ is needed in the computation of |j_{n-1}⟩. Both |j_n⟩ and |j_{n-1}⟩ are required in order to obtain |j_{n-2}⟩. Finally, the values of all the higher-rank bits are used to determine |j_1⟩ precisely. Thus, computing the inverse Fourier transform by the quantum circuit illustrated in Figure 4.3 is a procedure the complexity of whose steps increases with their rank.

Figure 4.3. Quantum circuit performing the inverse Fourier transform.
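The gate-level construction just described can also be checked numerically. The sketch below, a minimal simulation whose helper names are invented for this illustration, assembles the circuit of Figure 4.2 (n Hadamard gates plus controlled-R_k rotations) and verifies both the n(n+1)/2 gate count and agreement with transformation (4.6); as is customary for this circuit, the output qubits emerge in reverse order (compare the labels k_n, ..., k_1 in Figures 4.4 and 4.5 below), so a final reordering is applied:

import numpy as np

def op_on(U, q, n):
    # Embed a single-qubit gate U acting on qubit q (0 = leftmost) into n qubits.
    M = np.array([[1.0]])
    for k in range(n):
        M = np.kron(M, U if k == q else np.eye(2))
    return M

def cphase(control, target, phase, n):
    # Controlled phase gate: multiply the amplitude by `phase` when both qubits are 1.
    N = 2 ** n
    d = np.ones(N, dtype=complex)
    for x in range(N):
        bits = format(x, '0%db' % n)
        if bits[control] == '1' and bits[target] == '1':
            d[x] = phase
    return np.diag(d)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def qft_from_gates(n):
    U = np.eye(2 ** n, dtype=complex)
    count = 0
    for i in range(n):                    # the column of Figure 4.2 for qubit |j_{i+1}>
        U = op_on(H, i, n) @ U
        count += 1
        for k in range(2, n - i + 1):     # R_k controlled by qubit i + k - 1
            U = cphase(i + k - 1, i, np.exp(2j * np.pi / 2 ** k), n) @ U
            count += 1
    P = np.zeros((2 ** n, 2 ** n))        # undo the reversed output order
    for x in range(2 ** n):
        P[int(format(x, '0%db' % n)[::-1], 2), x] = 1
    return P @ U, count

n = 3
U, count = qft_from_gates(n)
assert count == n * (n + 1) // 2          # n Hadamards + n(n-1)/2 rotations
N = 2 ** n
F = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)
assert np.allclose(U, F)                  # matches the matrix of (4.6)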


Can a parallel approach be employed to counter this variation in complexity and make all steps take a constant amount of time to execute? In the case of the quantum Fourier transform, it is interesting to note that strict sequentiality is enforced by the laws of quantum mechanics, but we still have a chance to speed up the computation, provided we restrict either the input or the output to be classical. No parallel algorithm exists in the general case, when an arbitrary superposition of the basis vectors is Fourier transformed and we are not allowed to measure the output. The reason for this impossibility is the quantum mechanical nature of the qubits controlling the phase shifts in Figures 4.2 and 4.3. Such a controlled rotation corresponds to a two-qubit gate, and we would need to apply, in parallel, a number of two-qubit gates in which the control qubit is the same for all gates. Since we cannot gain knowledge of the control qubit's state through measurement, and cloning an unknown quantum bit is forbidden by the laws of quantum mechanics, any attempt to parallelize the procedure in the general case is doomed to failure.

However, if the Fourier transform step comes right before measurement in a quantum algorithm, then it is possible to devise a parallel solution that can reduce the total running time. This is not too much of a constraint, since virtually all quantum algorithms that use some form of Fourier transform to create an interference among the multiple computational paths proceed to follow it with a measurement of the quantum register. In particular, this is also true of Shor's quantum algorithms for factoring integers and computing discrete logarithms. Before presenting the details of the parallel architecture designed to compute the quantum Fourier transform, the most efficient sequential way of performing the same computation is first described, under the assumption that the output is measured (and is, therefore, classical).

4.6.2. Semi-classical solution

Although the circuits for computing the quantum Fourier transform and its inverse are efficient in terms of the total number of gates employed, the majority of these gates operate on two qubits. This makes a practical implementation difficult, since arranging for one qubit to influence another in a desired way is a far greater challenge than evolving a single-qubit closed quantum system in accordance with some unitary transformation. A method to replace all the two-qubit gates in the circuit performing the quantum Fourier transform by a smaller number of one-qubit gates controlled by classical signals has been developed by Griffiths and Niu.72 Their approach takes advantage of the fact that the roles of the control and target qubits in any of the two-qubit gates required to carry out the computation of the quantum Fourier transform are interchangeable. Consequently, the quantum circuit in Figure 4.2 is equivalent to the one depicted in Figure 4.4 (for inputs restricted to four qubits).

Note that, from this new perspective, the computation of the quantum Fourier transform appears to be a procedure whose steps are of increasing complexity. However, under the assumption that the Fourier transform is immediately followed by a quantum measurement, the complexity of each step in the computation can be made constant. Since a control qubit enters and leaves a two-qubit gate unchanged, the top qubit in Figure 4.4 yields the same result regardless of whether it is measured as it exits the circuit or immediately after undergoing the Hadamard transform. In the latter case, the result of the measurement can be used to determine the phase shift that needs to be applied to the second qubit, before it too is subjected to a Hadamard transform and then measured.

Figure 4.4. Alternative arrangement of gates in the circuit performing the quantum Fourier transform.

Figure 4.5. Semiclassical circuit for computing the quantum Fourier transform.

The phase computed for the second qubit, together with the result of the second measurement, is passed down as classical input for the rotation applied to the third qubit. The computation proceeds in this manner all the way down to the last qubit, with a phase rotation, a Hadamard gate, and a measurement being performed at each step. The process is illustrated in Figure 4.5, where double lines denote a classical signal, according to the usual convention. Although the phase shift applied to each qubit is considered a single operation, conceptually it is a combination of the gates depicted in the corresponding box, with each component applied only if the controlling qubit was measured as 1.

This semi-classical approach to computing the quantum Fourier transform achieves optimality in terms of the number of elementary unitary transformations that have to be applied. It also has the important advantage of employing only quantum transformations acting on a single qubit at a time. However, there is still room for improvement, as the total time needed to complete the computation can be squeezed down further if parallelism is brought into play. In what follows, it is shown how a quantum pipeline architecture is able to speed up the computation of the Fourier transform.71

4.6.3. Parallel approach

The solution developed in Ref. [72] to reduce the complexity of the quantum Fourier transform envisages a purely sequential approach, which is motivated by the same data dependency that causes the complexity of a step to vary with its rank.


Figure 4.6. Quantum pipeline array for computing the Fourier transform.

Nevertheless, there is a certain degree of parallelism worth exploiting in the computation of the quantum Fourier transform (or its inverse) in order to minimize the overall running time. The parallel approach is based on the observation that once a qubit has been measured, all phase shift gates classically controlled by the outcome of that measurement can be applied in parallel. The arrangement, again for just four qubits, is shown in Figure 4.6. The one-qubit gates are ordered in a linear array, with a Hadamard transform at the top followed by a π/2 phase shift gate. The phase shift induced by any other gate down the array is just half the rotation performed by the gate immediately preceding it. This architecture allows R_2, R_3 and R_4 to be performed in parallel during the first cycle. Since each phase shift gate acts on a different qubit, they can all be applied simultaneously if the top qubit yielded a 1 upon measurement. In the second cycle, each qubit in the array travels up one position, except of course for the top one, which has already been measured. Then, depending on the outcome of the second measurement, R_2 and R_3 can be effected simultaneously on the corresponding qubits. In the third cycle, only R_2 is needed, and only if the control is 1. The computation ends with the last qubit reaching the Hadamard gate and being measured afterwards. A formal description of the procedure, in the general case, is given in what follows.


Procedure Parallel Quantum Fourier Transform
Input: |j_1 j_2 ··· j_n⟩
Output: k_1 k_2 ··· k_n

for i = 1 to n do
    |j_i⟩ ←− H|j_i⟩;
    Measure |j_i⟩ as k_{n-i+1};
    if k_{n-i+1} = 1 then
        for l = 2 to n - i + 1 do in parallel
            |j_{i+l-1}⟩ ←− R_l |j_{i+l-1}⟩;
            |j_{i+l-1}⟩ moves one position up in the array
        endfor
    endif
endfor
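Since every quantum operation in this procedure is driven by classical measurement outcomes, its timing behavior can be explored with a purely classical simulation. The sketch below is a toy model using the cost convention adopted in the analysis that follows (one time unit per Hadamard, per measurement, and per phase-shift cycle); all names are invented here for illustration:

import random

def running_times(outcomes):
    # outcomes[i] is the bit observed at the i-th measurement (k_{n-i} in the text).
    n = len(outcomes)
    seq = par = 0
    seen_one = False
    for i, bit in enumerate(outcomes):
        if seen_one:
            seq += 1            # sequential: a programmed phase shift at this step
        seq += 2                # Hadamard + measurement
        par += 2                # Hadamard + measurement
        if bit == 1 and i < n - 1:
            par += 1            # one cycle in which all controlled shifts fire at once
        seen_one = seen_one or bit == 1
    return seq, par

n = 8
print(running_times([1] * n))              # worst case: (3n - 1, 3n - 1)
print(running_times([1] + [0] * (n - 1)))  # largest gap: (3n - 1, 2n + 1)
print(running_times([random.randint(0, 1) for _ in range(n)]))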


In the worst case, when all qubits are measured as 1, there is no difference between the parallel algorithm outlined above and the sequential solution envisaged by Griffiths and Niu72 with respect to the overall running time. Assuming, for analysis purposes, that measuring a qubit, applying a phase shift, and performing a Hadamard transformation each take one time unit, the total time necessary to complete the Fourier transform on a quantum register with n qubits is 3n - 1, as the top qubit in both the sequential circuit of Figure 4.5 and the parallel circuit of Figure 4.6 does not require a phase shift. This analysis considers only the quantum operations that need to be performed. The sequential method also requires some classical computation, when the phase shift to be applied to each qubit is calculated.

However, in the average case, some of the classical signals controlling the array of phase shift gates in Figure 4.6 will have been observed as 0, meaning that no phase shifts have to be performed during those respective cycles. In contrast, the sequential solution depicted in Figure 4.5 requires the application of a phase shift at every step following the first measurement with outcome 1. If the expected probability of a measurement yielding 0 equals the expected probability of observing a 1, then the running time of the parallel solution is shorter than the sequential running time by a difference proportional to the time it takes to effect O(n) phase shift gates, where n is the size of the input register. The difference between the sequential running time and the parallel running time is greatest when |j_1⟩ is measured as 1 and all the other qubits are observed in the state 0. In this case, the circuit in Figure 4.5 still performs n - 1 phase shifts, for a total running time of 3n - 1 time units, while the circuit in Figure 4.6 executes all n - 1 phase shifts in parallel during the first cycle, thus completing the computation in 2n + 1 time units.

The second advantage of the parallel approach is that the phase shift gates that need to be applied during the computation are known at the outset, making it easy to set them up beforehand to form the required linear array architecture. The systolic mode of operation of the quantum array compensates for the fixed characteristics of each gate, the qubits traversing the array to undergo a specific quantum evolution at each node. In the sequential approach, the phase shift applied to each qubit is not known at the outset, as it is computed on the fly from the information about the measurements performed so far, transmitted as classical signals. This means that the gates effecting the necessary phase shifts in the semi-classical approach of Griffiths and Niu72 have to be "programmed" or adjusted during the computation, in order to accommodate a discrete set of possible values for the phase shift.

The semi-classical Fourier transform and its parallelization are applicable to those quantum computations in which the Fourier transform immediately precedes a measurement of the qubits involved. Furthermore, the quantum systolic array architecture works equally well if the input is already classical, in which case the restriction to measure the qubits after applying the Fourier transform can be lifted altogether.


Figure 4.7. Quantum pipeline array for computing the Fourier transform on classical inputs.

When j_1, j_2, ..., j_n are classical bits, the topology of the circuit in Figure 4.6 remains unchanged, except that no measurements are performed and the flow of data through the linear array is reversed, as shown in Figure 4.7. As more data are fed into the linear array through the Hadamard gate, after having "controlled" the parallel execution of a set of phase shifts, the computational complexity of each step increases with its rank. When j_1 enters the array, only the Hadamard gate is active, but with each consecutive step, a new gate down the array joins the ones above it to operate on the qubits traversing the array. Because these gates operate in parallel, the execution time of each step remains constant. Also note that, in this case, all outputs are obtained simultaneously, during the last step of the computation. The overall parallel running time, in the worst case, is therefore 2n - 1 time units, as there are no measurements to perform. In most cases, however, the parallel running time is smaller than the time needed to complete the computation in a purely sequential manner, where the qubits are dealt with one after the other, in decreasing order of their ranks.

Although applying the quantum Fourier transform to a classical input is of little value for quantum computing, the situation is different for quantum cryptography. Distributing classical keys through quantum means is a procedure that may use the quantum Fourier transform and its inverse as encoding and decoding algorithms to protect vital information while in transit.73–75


Naturally, the parallel approach employed for the computation of the direct Fourier transform is also applicable, with the same results, to the circuit in Figure 4.3, which performs the inverse Fourier transform.

The difficulty of devising a parallel algorithm for computing the quantum Fourier transform comes from the data dependency between the different steps of the procedure. In most cases, it is exactly this precedence among the steps composing an algorithm that determines the variation in complexity. As a consequence, it is not easy, in general, to design a parallel solution to a problem whose steps are characterized by rank-varying complexity. The data dependency may impose a strict order of execution, making the resulting algorithm inherently sequential (think of Euclid's algorithm again). But there is also a positive aspect of the data dependency characterizing the quantum Fourier transform: it may be exploited in cryptographic applications, for example to increase the security and intrusion-detection rate in quantum key distribution protocols.73–75

On the other hand, perhaps there exist computations made up of steps of various rank-dependent complexities for which the order of execution is of no consequence to the correctness of the computation. Imagine, for instance, a task made up of n steps, S_1, S_2, ..., S_n, where the steps can be executed in any order, but the more steps we execute before a certain step S_j (1 ≤ j ≤ n), the more time it takes to complete S_j. For example, S_j may require i elementary operations (time units) if executed ith, for i = 1, 2, ..., n. This rank-driven increase in complexity may be due to how many pieces of data have to be taken into consideration at each consecutive step, or to how the data were affected by executing the previous steps, or it may be justified by the size of the partial solution that has to be constructed at each step. In any case, the problem of coping with steps of ever increasing complexity is avoided altogether by a parallel machine endowed with n processing units: all steps are then executed simultaneously, and since each step has rank 1 in such a parallel approach, the computational complexity is kept constant (one time unit, in our example) for all steps, as the sketch below illustrates.
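A toy cost model makes the contrast concrete. In this illustrative sketch (the function names are invented here), a sequential machine pays cost(i) for the step executed ith, while a machine with n processing units executes every step at rank 1:

def sequential_time(n, cost):
    # One processor: the step executed i-th (i = 1..n) costs cost(i) time units.
    return sum(cost(i) for i in range(1, n + 1))

def parallel_time(n, cost):
    # n processors: all steps run simultaneously, each at rank 1.
    return cost(1)

n = 20
print(sequential_time(n, lambda i: i), parallel_time(n, lambda i: i))
# 210 versus 1, i.e., n(n + 1)/2 versus constant
print(sequential_time(n, lambda i: 2 ** i), parallel_time(n, lambda i: 2 ** i))
# 2097150 versus 2 for the exponential variant discussed next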


The difference between a sequential and a parallel approach is even more dramatic if the complexity of a step grows faster with its rank. In the case where step S_j (1 ≤ j ≤ n) needs 2^i elementary operations (time units) to be completed if executed ith, for i = 1, 2, ..., n, the benefits of the parallel approach are much greater.

From the research viewpoint adopted in this chapter, it remains an open problem to investigate whether there are quantum instances belonging to the rank-varying computational complexity paradigm for which there is no pre-determined order of execution of the steps composing the algorithm. Perhaps the reversible nature of quantum evolutions may play some role, in the sense that when a step is executed, it must first undo the transformations performed by all previously executed steps.

The difference in time complexity between the sequential approach and the parallel one, in the computation of the direct or inverse quantum Fourier transform, may seem insignificant from a theoretical perspective, but it proves essential under practical considerations, as is shown next.

4.6.4. Quantum decoherence

Qubits are fragile entities, and one of the major challenges in building a practical quantum computer is to find a physical realization that allows a computation to be completed before the quantum states involved become seriously affected by quantum errors. In an ideal setting, we evolve our qubits in perfect isolation from the outside world. But any practical implementation of a quantum computation will be affected by the interactions taking place between our system and the environment. These interactions cause quantum information to leak out into the environment, leading to errors in our qubits. Different types of errors may affect an ongoing computation in different ways, but quantum decoherence, as defined below, usually occurs extremely rapidly and can seriously interfere with computing the quantum Fourier transform and its inverse.

In the context of a quantum key distribution protocol,73–75 consider the task of recovering the original (classical) bit string j = j_1 j_2 ... j_n from its quantum Fourier transformed form. The circuit performing this computation (see Figure 4.3) takes as input n qubits.


The state of each qubit can be described by the following general equation:

$$
|\psi_k\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{e^{i\theta_k}}{\sqrt{2}}|1\rangle, \qquad 1 \le k \le n, \tag{4.8}
$$

where the relative phase θ_k, characterizing the qubit of rank k, depends on the values of bits j_k, j_{k+1}, ..., j_n. The corresponding density operator is given by

$$
\rho_k = |\psi_k\rangle\langle\psi_k| = \frac{1}{2}|0\rangle\langle 0| + \frac{e^{-i\theta_k}}{2}|0\rangle\langle 1| + \frac{e^{i\theta_k}}{2}|1\rangle\langle 0| + \frac{1}{2}|1\rangle\langle 1|, \tag{4.9}
$$

or in matrix form

$$
\rho_k = \frac{1}{2}\begin{pmatrix} 1 & e^{-i\theta_k}\\ e^{i\theta_k} & 1 \end{pmatrix}. \tag{4.10}
$$

The diagonal elements (the populations) measure the probabilities that the qubit is in state |0⟩ or |1⟩, while the off-diagonal components (the coherences) measure the amount of interference between |0⟩ and |1⟩.76 Decoherence, resulting from interactions with the environment, causes the off-diagonal elements to vanish. Since that is where the whole information carried by a qubit is stored, the input qubits for computing the inverse Fourier transform are very sensitive to decoherence. When they become entangled with the environment, the interference brought about by the Hadamard gate is no longer possible, as the system effectively becomes a statistical mixture. In other words, decoherence makes a quantum system behave like a classical one. Naturally, this process is not instantaneous, but it usually occurs extremely rapidly, subject to how well a qubit can be isolated from its environment in a particular physical realization. Because of decoherence, we must obtain the values of j_1, j_2, ..., j_n before a time limit δ, after which the errors introduced by the coupling with the environment are too serious to still allow the recovery of the binary digits of j.
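The effect is easy to visualize numerically. The following sketch applies a phenomenological dephasing model to the density matrix (4.10); the assumption made here for illustration is that the coherences decay as e^{-γt} while the populations stay fixed:

import numpy as np

def rho(theta):
    # Density matrix (4.10) of the qubit of rank k.
    return 0.5 * np.array([[1, np.exp(-1j * theta)],
                           [np.exp(1j * theta), 1]])

def dephase(r, gamma, t):
    # Toy dephasing channel: off-diagonal coherences decay as exp(-gamma * t).
    d = np.exp(-gamma * t)
    return np.array([[r[0, 0], r[0, 1] * d],
                     [r[1, 0] * d, r[1, 1]]])

r0 = rho(np.pi / 3)
for t in (0.0, 1.0, 5.0):
    print(t, np.round(dephase(r0, gamma=1.0, t=t), 4))
# as t grows, the state approaches the maximally mixed state I/2:
# the phase theta_k, and with it the binary digits of j, becomes unrecoverable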


The precise value of δ will certainly depend on the particular way chosen to embody quantum information, but if δ lies between the parallel completion time and the sequential completion time, then the quantum pipeline array may be the only architecture capable of precisely recovering all the digits in the binary expansion of j. From a different perspective, the parallel solution allows longer bit strings to be transmitted between the communicating parties, thus achieving better scalability than the purely sequential approach.

Griffiths and Niu72 also point to decoherence as a possible problem when discussing their semiclassical solution to computing the quantum Fourier transform, and suggest countering it by arranging the computation in such a way that the more significant bits of the input register are produced earlier than the less significant ones. This may or may not be possible, depending on the particular characteristics of a given application. In the case of the inverse Fourier transform used as a decoding method in a quantum key distribution protocol,73–75 starting to work early on higher-rank qubits is not possible, because the rank of each qubit is not disclosed until the second stage of the protocol, when all qubits are available to the receiving party. In such a situation, the parallel approach previously described can make the difference between success and failure when computing the quantum Fourier transform or its inverse in a practical setting. Alternatively, since the overall running time scales up with the number of input qubits, parallelism may be a way to improve scalability and still complete the computation before decoherence effects take hold. In this context, it is essential to emphasize that scalability and decoherence are the two most important issues in designing a practical quantum computer.

4.6.5. Time-varying variables

In this section, it was seen that the computation of the Fourier transform by quantum means belongs to the class of computations in which the complexity of each step depends on its rank. In addition, if we also take into consideration the properties of the computational environment, we are faced with the negative effects caused by quantum decoherence. Formally, the data stored in the quantum register before the time limit δ are significantly different from what the same qubits encode after the decoherence threshold δ.


The coupling between our qubits and their surrounding environment effectively places a hard deadline on the computation. After this deadline, the input data (variables) will have changed, and if the computation is not yet complete, it has inevitably failed. From this perspective, the computation of the quantum Fourier transform (whether direct or inverse) in the presence of decoherence is an example of the paradigm dealing with time-varying variables. As was demonstrated above, parallelism can help us cope with variables whose values change over time. The use of a parallel approach becomes critical when the solution to a certain problem must accommodate a deadline. In our case, quantum decoherence places an upper bound on the scalability of computing the quantum Fourier transform or its inverse, and the only chance to reach beyond that limit is through a parallel solution.

4.7. Quantum Error-correction

In the examples presented in the previous section, the complexity of each step evolves with its rank. The more steps are executed before the current one, the higher the computational resources required to complete it. In this section, the focus remains on steps of variable complexity, but here the variation is time driven rather than rank driven. In other words, we can have a high computational complexity even for the first step, if we allow some time to pass before starting the computation. The amount of computational resources required to successfully carry out a certain step is directly proportional to the amount of time elapsed since the beginning of the computation. This paradigm is illustrated through the use of error-correcting codes employed to keep a quantum computation error-free.

The laws of quantum mechanics prevent, in general, a direct application of classical error-correction techniques. We cannot inspect (measure) at leisure the state of a quantum memory register to check whether an ongoing computation is on track, without the risk of altering the intended course of the computation.


Moreover, because of the no-cloning theorem, quantum information cannot be amplified in the way digital signals can. Correcting quantum errors certainly requires much more ingenuity than fixing classical bits, but the basic idea of using redundancy is still useful. As in the classical case, the information contained in one qubit is spread out over several qubits, so that damage to any one of them does not influence the outcome of the computation. In the quantum case, though, the encoding of the logical qubit is achieved through the use of specific resources, by entangling the logical qubit with several ancilla qubits. In this way, the information in the state of the qubit to be protected is spread among the correlations characterizing an entangled state. Paradoxically enough, entanglement with the environment can be fought back using quantum error-correcting codes based on entanglement.77

4.7.1. Quantum codes

The construction of all quantum error-correcting codes is based on the surprising, yet beautiful, idea of digitizing the errors. How can quantum errors be digitized when, like the variables they affect, they form a continuum? The answer lies in the linear nature of quantum mechanics. Any possible error affecting a single qubit can be expressed as a linear combination of no error (I), bit flip errors (X), phase errors (Z), and combined bit flip and phase errors (Y), where I, X, Z and Y are the Pauli operators describing the effect of the respective errors. Generalizing to the case of a quantum register, an error can be written as $\sum_i e_i E_i$ for some error operators E_i and coefficients e_i. The error operators can be tensor products of the single-bit error transformations or more general multibit transformations. An error-correcting code that can undo the effect of any error belonging to a set of correctable errors E_i will embed n data qubits (logical qubits) in n + k code qubits (physical qubits). The joint state of the ensemble of code qubits is subject to an arbitrary error, mathematically expressed as a linear combination of the correctable error operators E_i. To recover the original encoded state, a syndrome extraction operator has to be applied, one that uses some ancilla qubits to create a superposition of the error indices i corresponding to those correctable error operators E_i that have transformed the encoded state.
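The digitization of single-qubit errors can be illustrated numerically: any 2 × 2 error operator decomposes over the Pauli basis, with coefficients given by the Hilbert–Schmidt inner product. A minimal sketch follows; the error chosen below, a small rotation about an arbitrary axis, is purely illustrative:

import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

def pauli_coefficients(E):
    # e_i = tr(P_i E) / 2, since tr(P_i P_j) = 2 delta_ij.
    return [np.trace(P @ E) / 2 for P in (I, X, Y, Z)]

theta = 0.17                                   # a small over-rotation
nx, ny, nz = 0.6, 0.0, 0.8                     # arbitrary unit axis
E = np.cos(theta / 2) * I - 1j * np.sin(theta / 2) * (nx * X + ny * Y + nz * Z)

e = pauli_coefficients(E)
print(e)                                       # the "digitized" error amplitudes
recomposed = e[0] * I + e[1] * X + e[2] * Y + e[3] * Z
assert np.allclose(recomposed, E)              # E = sum_i e_i E_i, as claimed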


Measuring only the ancilla qubits will collapse the superposition of errors, yielding a single index k. But because the ancilla qubits were entangled with the code qubits through the application of the syndrome extraction operator, the side effect of the measurement is that the corruption caused by all error transformations will be undone, save for the one corresponding to index k. Consequently, only one inverse error transformation is required to complete the recovery process. In essence, knowing how to deal with a set of fundamental error transformations allows us to tackle any linear combination of them by projecting it onto one of the basis components. This process is referred to as digitizing or discretizing the errors.

Peter Shor's second major contribution to the advancement of quantum computation was the creation, in 1995, of an algorithm that could correct any kind of error (amplitude and/or phase errors) affecting a single qubit in a 9-qubit code.78 In a different approach, Steane studied the interference properties of multiple-particle entangled states and managed to devise a shorter, 7-qubit code.79 The number of qubits necessary for perfect recovery from a single error was later squeezed down to a minimum of five.80, 81 Naturally, in order to cope with more than one error at a time, it is necessary to use larger and more elaborate codes. The book by Nielsen and Chuang53 offers a detailed treatment of quantum codes, explaining how ideas from classical linear codes can be used to construct large classes of quantum codes, such as the Calderbank–Shor–Steane (CSS) codes,82, 83 or the stabilizer codes (also known as additive quantum codes), which are even more general than the CSS codes and are based on the stabilizer formalism developed by Gottesman.84

The major drawback of using large and intricate quantum codes is that the corrective circuit itself is as prone to errors as the quantum circuit responsible for the main computation. The more errors we attempt to rectify, the more the complexity and length of the recovery procedure increase (see Ref. [85] for some theoretical bounds on the relationship between the number of data qubits, the total number of entangled qubits, and the maximal number of errors that can be tolerated).


Thus, we can only increase the size of the error-correcting codes up to a certain cutoff point, past which no further gains in accuracy can be made. One attempt to overcome this limitation is the use of concatenated codes. If a certain code uses n physical qubits to encode one logical qubit, a concatenated version of that code is obtained by further encoding each of the n qubits in another block of n. This hierarchical (tree) structure can be expanded to accommodate as many levels as desired. By adding more levels of concatenation, the overall chance of an error can be made arbitrarily small, provided that the probability of an individual error is kept below a certain critical threshold.86 Of course, the high cost of using concatenated codes lies in the exponential increase in the number of qubits with the number of levels added.
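The trade-off behind concatenation can be quantified with the standard threshold recursion; the specific numbers below are illustrative assumptions, not values from the text. Each added level squares the ratio of the error rate to the threshold, while the number of physical qubits is multiplied by the block size:

def logical_error_rate(p, p_threshold, levels):
    # After L levels of concatenation: p_threshold * (p / p_threshold)^(2^L).
    return p_threshold * (p / p_threshold) ** (2 ** levels)

p, p_th, block = 1e-3, 1e-2, 7   # assumed physical error rate, threshold, block size
for L in range(4):
    print(L, block ** L, logical_error_rate(p, p_th, L))
# the qubit count grows exponentially with L (1, 7, 49, 343, ...),
# while the logical error rate falls doubly exponentially (1e-3, 1e-4, 1e-6, 1e-10)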


4.7.2. Time-varying complexity

This short exposition of the various quantum error-correcting codes devised to maintain the coherence of fragile quantum states, and to protect them from dissipative errors caused, for example, by spontaneous emission, clearly shows one thing: the more time it takes to complete a quantum computation, the more errors are introduced in the process and, consequently, the more time, the more ancilla qubits, and the more complex the error-correcting schemes that need to be employed. Correcting quantum errors is an important task executed alongside the mainstream computation, and its complexity is heavily dependent on time. Steps executed soon after the initialization of the quantum register will require no or low-complexity recovery techniques, while steps executed long after the initialization time may require complicated schemes and heavy resources to deal with quantum errors.

As with the other paradigms investigated in this chapter, here too parallelism can help avoid this increase in the complexity of the recovery procedure and ultimately ensure the success of the computation. If the steps of the algorithm are independent of one another and can be executed in any order, then the most straightforward application of parallelism is to execute all steps simultaneously and thus complete the computation before any serious errors can accumulate over time. In this way, we try to avoid or elude quantum errors rather than deal with them. But parallelism, in the form of redundancy, can also be used to correct quantum errors.

4.7.3. Error correction via symmetrization

The technique called error correction via symmetrization87, 88 is yet another example of how the duality of quantum-mechanical laws can be exploited for the benefit of quantum computation. Although the measurement postulate severely restricts us in recycling techniques from classical error correction, it can still offer conceptually new ways of achieving error correction that are simply unavailable to classical computers. Error correction via symmetrization relies on the projective effect of measurements to do the job. The technique uses n quantum computers, each performing the same computation. Provided no errors occur, the joint state of the n computers is a symmetric one, lying somewhere in the small symmetric subspace of the entire possible Hilbert space. Devising a clever measurement that projects the joint state back into the symmetric subspace should be able to undo possible errors, without even knowing what the error is. To achieve this, the n quantum computers need to be carefully entangled with a set of ancilla qubits placed in a superposition representing all possible permutations of n objects. In this way, the computation can be performed over all permutations of the computers simultaneously. Then, by measuring the ancilla qubits, the joint state of the n computers can be projected back into just the symmetric computational subspace, without the errors being measured explicitly. Peres has shown that this technique is most appropriate for correcting several qubits that are slightly wrong, rather than a single qubit that is terribly wrong.89 Error correction via symmetrization can be applied repeatedly, at regular time intervals, to avoid the accumulation of large errors and continually project the computation back into its symmetric subspace.


No matter which parallel approach is employed, if the required number of quantum processing units is provided, then the algorithm is successful. Simulating the same solution on an insufficient number of quantum computers will lead to a gradual accumulation of quantum errors, up to the point where the results of the computation are compromised.

4.8. Entanglement Revisited

In this section, the focus moves to the most counterintuitive property exhibited by quantum particles, namely entanglement. As described in Sections 4.2.2.4 and 4.3.2.2, the components of a quantum system are said to be entangled if the state of the ensemble cannot be broken down or decomposed into the states of the constituents. Although the state of the system as a whole is well defined, neither of its components is in a well-defined state. Entanglement is responsible for the strong correlations exhibited by two or more particles when they are measured, correlations that cannot be explained by classical means. At an abstract level, entanglement among qubits can be described as the behavior exhibited by a set of interacting variables. When such a variable is subjected to a measurement, the process has consequences for the other variables in the set as well.

4.8.1. Interacting variables

Formally, suppose there are n variables x_0, x_1, ..., x_{n-1}. Although these variables may represent the parameters of a physical or biological system, the following formalism is abstracted away from any particular realization and does not necessarily describe the dynamics of a quantum system. The dependence of each variable on all the others induces the system to evolve continually until a state of equilibrium may eventually be reached. In the absence of any external perturbations, the system can remain in a stable state indefinitely.


We can model the interdependence between the n variables through a set of functions, as follows:

$$
\begin{aligned}
x_0(t+1) &= f_0(x_0(t), x_1(t), \ldots, x_{n-1}(t))\\
x_1(t+1) &= f_1(x_0(t), x_1(t), \ldots, x_{n-1}(t))\\
&\;\;\vdots\\
x_{n-1}(t+1) &= f_{n-1}(x_0(t), x_1(t), \ldots, x_{n-1}(t)).
\end{aligned} \tag{4.11}
$$

This system of equations describes the evolution of the system from state (x_0(t), x_1(t), ..., x_{n-1}(t)) to state (x_0(t+1), x_1(t+1), ..., x_{n-1}(t+1)), one time unit later. Once the system has reached equilibrium, its parameters no longer change over time. It is important to emphasize that, in most cases, the dynamics of the system are very complex, so the mathematical description of the functions f_0, f_1, ..., f_{n-1} is either not known to us, or we have only rough approximations for them.

Assuming the system is in an equilibrium state, our task is to measure its parameters in order to compute a function F, possibly a global property of the system at equilibrium. In other words, we need the values of x_0(τ), x_1(τ), ..., x_{n-1}(τ) at the moment τ, when the system is in a stable state, in order to compute F(x_0(τ), x_1(τ), ..., x_{n-1}(τ)). Without loss of generality, we can try to estimate the value of x_0(τ), for instance, by measuring the respective parameter at time τ. Although, for some systems, we can acquire the value of x_0(τ) easily in this way, the consequences for the entire system can be dramatic. Unfortunately, any measurement is an external perturbation to the system, and in the process, the parameter subjected to measurement may be affected unpredictably. Thus, the measurement operation will change the state of the system from (x_0(τ), x_1(τ), ..., x_{n-1}(τ)) to (x_0'(τ), x_1(τ), ..., x_{n-1}(τ)), where x_0'(τ) denotes the value of variable x_0 after measurement.


In those cases where the measurement process has a non-deterministic effect upon the variable being measured, we cannot estimate x_0'(τ) in any way. But, regardless of the particular instance of the model, the transition from (x_0(τ), x_1(τ), ..., x_{n-1}(τ)) (that is, the state before measurement) to (x_0'(τ), x_1(τ), ..., x_{n-1}(τ)) (that is, the state after measurement) does not correspond to the normal evolution of the system according to its dynamics described by the functions f_i, 0 ≤ i < n. However, because the equilibrium state was perturbed by the measurement operation, the system will react with a series of state transformations, governed by Equations (4.11). Thus, at each time step after τ, the parameters of the system will evolve either towards a new equilibrium state or perhaps fall into chaotic behavior. In any case, at time τ + 1, all n variables have acquired new values, according to the expressions of the functions f_i:

$$
\begin{aligned}
x_0(\tau+1) &= f_0(x_0'(\tau), x_1(\tau), \ldots, x_{n-1}(\tau))\\
x_1(\tau+1) &= f_1(x_0'(\tau), x_1(\tau), \ldots, x_{n-1}(\tau))\\
&\;\;\vdots\\
x_{n-1}(\tau+1) &= f_{n-1}(x_0'(\tau), x_1(\tau), \ldots, x_{n-1}(\tau)).
\end{aligned} \tag{4.12}
$$

Consequently, unless we are able to measure all n variables, in parallel, at time τ, some of the values composing the equilibrium state (x_0(τ), x_1(τ), ..., x_{n-1}(τ)) will be lost without any possibility of recovery.
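A toy classical instance of this model shows why only a parallel read succeeds. In the sketch below, the coupling functions f_i and the disturbance caused by a read are invented solely for illustration; the equilibrium value 0.5 plays the role of the state (x_0(τ), ..., x_{n-1}(τ)):

import random

def f(state):
    # One synchronous application of Equations (4.11); this particular coupling
    # (chosen only for illustration) has the fixed point x_i = 0.5 for all i.
    n = len(state)
    mean = sum(state) / n
    return [0.5 * x + 0.5 * mean for x in state]

def read(state, i):
    # Reading x_i returns its value but unpredictably disturbs the other variables.
    disturbed = [x if k == i else x + random.uniform(-0.1, 0.1)
                 for k, x in enumerate(state)]
    return state[i], disturbed

random.seed(1)
equilibrium = [0.5] * 4
par_readings = list(equilibrium)     # parallel: all variables read at time tau
state = list(equilibrium)            # sequential: one read per time unit
seq_readings = []
for i in range(4):
    v, state = read(state, i)
    seq_readings.append(v)
    state = f(state)                 # the perturbed system evolves away
print(par_readings)                  # [0.5, 0.5, 0.5, 0.5]
print(seq_readings)                  # later readings drift from the true values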


It is important to emphasize that the computational paradigm to which the above setting belongs is not a conventional one. The input data necessary to compute F are not available at the outset and have to be acquired through measurement operations. Of course, labeling the process of obtaining the necessary information as a computation may be a bit surprising if one is accustomed to seeing computation from the conventional point of view (like, for example, performing a basic arithmetic operation on a pair of numbers). However, the qualitatively new ways of manipulating information nowadays are forcing us to challenge the limitations of the classical computational paradigm and adopt a broader, non-classical perspective on computation.90 Here, computation is regarded as the entire process of receiving input from the environment, manipulating it, and applying the output to the environment, all of which are fundamental, necessary, and inseparable steps of information processing (see Sections 4.4 and 4.5). From this new perspective, a computing machine is seen as an open system whose output depends on its interaction with the outside world, a system capable of taking on new information (either communicated to it by an external agent or acquired directly through measurements). The emergence of this new model of computation is motivated by applications as diverse as data acquisition in signal processing91 and the control of nuclear power plants.92 Furthermore, such a computational paradigm can be realized through various physical means including, of course, a quantum mechanical one.34

4.8.2. Quantum distinguishability

Another example is now exhibited of a task that cannot be completed successfully unless a parallel approach is employed. The task is to distinguish among the elements of a set of quantum states, using any quantum measurements that can theoretically be applied. There are no restrictions concerning the number of measurements allowed or the time by which the task has to be completed. It is shown that there exists a set of entangled states, forming an orthonormal basis in the state space spanned by n qubits, for which only a joint measurement (in that respective basis) of all the qubits composing the system can achieve perfect distinguishability. An important characteristic of the task is that if the degree of parallelism necessary to solve the problem is not available, then the solution is no better than a purely sequential approach. This problem, arising in quantum information theory, also strengthens the conclusion that there is no finite computing device (conventional or unconventional) upon which the attribute universal can be bestowed.

The problem of distinguishing among entangled quantum states is a quantum mechanical instance of the formalism detailed in Section 4.8.1, namely, that of measuring interdependent variables. Suppose we have a fixed set of quantum states described using the


usual Dirac notation |Ψ_i⟩ (1 ≤ i ≤ n), known to both Alice and Bob. Alice randomly chooses a state from the set and prepares a qubit (or set of qubits) in that particular state. She then gives the qubit(s) to Bob, who is free to investigate them in any way he likes. To be more specific, Bob can apply any kind of measurement to the qubit(s) and possibly process and/or interpret the information acquired through measurement. In the end, his task is to identify the index i of the state characterizing the qubit(s) Alice has given him. The only case in which a set of quantum states can be reliably (that is, 100% of the time) distinguished from one another is when they are pairwise orthogonal.

Now consider the case in which we try to distinguish among the four Bell states

$$
\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle), \qquad \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle), \qquad \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle), \qquad \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle),
$$

by resorting only to direct quantum measurements (in other words, no quantum transformations are allowed before a measurement). In these circumstances, any sequential approach (that is, measuring the qubits one after the other) is of no help, regardless of the basis in which the measurements are performed. By measuring the two qubits, in sequence, in the computational basis, Bob can distinguish the states (1/√2)(|00⟩ ± |11⟩) from (1/√2)(|01⟩ ± |10⟩). He does this by checking whether the outcomes of the two measurements are the same or not. But this kind of measurement makes it impossible to differentiate between (1/√2)(|00⟩ + |11⟩) and (1/√2)(|00⟩ - |11⟩), or between (1/√2)(|01⟩ + |10⟩) and (1/√2)(|01⟩ - |10⟩). Alternatively, Bob can decide to perform his measurements in a different basis, such as (|+⟩, |-⟩), where the basis vectors are

$$
|+\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle, \qquad |-\rangle = \frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle.
$$


Due to the fact that

$$
\frac{|{+}{+}\rangle + |{-}{-}\rangle}{\sqrt{2}} = \frac{|00\rangle + |11\rangle}{\sqrt{2}} \qquad \text{and} \qquad \frac{|{+}{-}\rangle + |{-}{+}\rangle}{\sqrt{2}} = \frac{|00\rangle - |11\rangle}{\sqrt{2}},
$$

Bob can now reliably distinguish the quantum state (1/√2)(|00⟩ + |11⟩) from (1/√2)(|00⟩ - |11⟩). Indeed, if the two qubits yield identical outcomes when measured in this new basis, then we can assert with certainty that the state was not (1/√2)(|00⟩ - |11⟩). Similarly, if the measurement outcomes for the qubits are different, the original state could not have been (1/√2)(|00⟩ + |11⟩). Unfortunately, in this new setup, the quantum states (1/√2)(|00⟩ + |11⟩) and (1/√2)(|01⟩ + |10⟩) become indistinguishable, and the same is true about (1/√2)(|00⟩ - |11⟩) and (1/√2)(|01⟩ - |10⟩).

The computational basis (|0⟩, |1⟩) and the basis (|+⟩, |-⟩) are two extremes among a (theoretically) infinite number of choices for the basis relative to which the quantum measurements are performed. But even though the separation line between the four Bell states will drift with the choice of the basis vectors, the two extreme cases discussed above offer the best possible distinguishability. Intuitively, this is due to the entanglement exhibited between the two qubits in all four states. As soon as the first qubit is measured (regardless of the basis), the superposition describing the entangled state collapses to the specific state consistent with the measurement result. In this process, some of the information originally encapsulated in the entangled state is irremediably lost. Consequently, measuring the second qubit cannot give a complete separation of the four EPR states. But the Bell states do form an orthonormal basis, which means that (at least theoretically) they can be distinguished by an appropriate quantum measurement.


However, this measurement must be a joint measurement of both qubits simultaneously, in order to achieve the desired distinguishability. Not surprisingly, this is very difficult to accomplish in practice.

Superdense Coding. The distinguishability of the four Bell (or EPR) states is the key feature in achieving superdense coding.93 However, in the experimental demonstration of this protocol,94 two of the four possibilities cannot be distinguished from one another, precisely because of the difficulties associated with implementing a joint measurement.
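A short numerical experiment captures both halves of the argument above: single-qubit (sequential) measurements in the computational basis yield identical statistics for (1/√2)(|00⟩ + |11⟩) and (1/√2)(|00⟩ - |11⟩), whereas a joint measurement with the four Bell projectors identifies every state with certainty. The sketch below is illustrative, with all names invented here:

import numpy as np

s = 1 / np.sqrt(2)
bell = {"phi+": s * np.array([1, 0, 0, 1]),    # (|00> + |11>)/sqrt(2)
        "phi-": s * np.array([1, 0, 0, -1]),   # (|00> - |11>)/sqrt(2)
        "psi+": s * np.array([0, 1, 1, 0]),    # (|01> + |10>)/sqrt(2)
        "psi-": s * np.array([0, 1, -1, 0])}   # (|01> - |10>)/sqrt(2)

def z_basis_probabilities(state):
    # Probabilities of outcomes 00, 01, 10, 11 for qubit-by-qubit measurement.
    return np.abs(state) ** 2

print(z_basis_probabilities(bell["phi+"]))     # [0.5, 0, 0, 0.5]
print(z_basis_probabilities(bell["phi-"]))     # [0.5, 0, 0, 0.5] -- identical

def joint_bell_measurement(state):
    # A single joint measurement whose projectors are the four Bell states.
    probs = {name: abs(np.vdot(b, state)) ** 2 for name, b in bell.items()}
    return max(probs, key=probs.get)

for name, b in bell.items():
    assert joint_bell_measurement(b) == name   # perfect distinguishability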

4.8.2.1. Generalization

A more compact representation of the Bell basis is through a square matrix in which each column is a vector describing one of the Bell states:

$$
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 1 & 0\\
0 & 1 & -1 & 0\\
1 & 0 & 0 & -1
\end{pmatrix}.
$$

The elements of each column are the amplitudes, or proportions, in which the computational basis states |00⟩, |01⟩, |10⟩ and |11⟩ are present in the respective EPR state. This scenario can be extended to ensembles of more than two qubits. The following matrix describes eight different entangled states that cannot be reliably distinguished unless a joint measurement of all three qubits involved is performed (the columns pair each computational basis state with its complement, mirroring the structure of the Bell matrix above):

$$
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & -1 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & -1 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 0 & -1
\end{pmatrix}.
$$


In general, for a quantum system composed of n qubits, one can define the following 2^n entangled states of the system:

$$
\frac{1}{\sqrt{2}}(|000\cdots0\rangle \pm |111\cdots1\rangle), \quad \frac{1}{\sqrt{2}}(|000\cdots1\rangle \pm |111\cdots0\rangle), \quad \ldots, \quad \frac{1}{\sqrt{2}}(|011\cdots1\rangle \pm |100\cdots0\rangle). \tag{4.13}
$$

These vectors form an orthonormal basis for the state space corresponding to the n-qubit system. The only chance to differentiate among these 2^n states using quantum measurement(s) is to observe all n qubits simultaneously, that is, to perform a single joint measurement of the entire system. In the given context, joint is really just a synonym for parallel. Indeed, the device in charge of performing the joint measurement must possess the ability to "read" the information stored in each qubit, in parallel, in a perfectly synchronized manner. In this sense, at an abstract level, and just for the sake of offering a more intuitive understanding of the process, the measuring apparatus can be viewed as having n probes. With all probes operating in parallel, each probe can "peek" inside the state of one qubit, in a perfectly synchronous operation. The information gathered by the n probes is seen by the measuring device as a single, indivisible chunk of data, which is then interpreted to give one of the 2^n entangled states as the measurement outcome.

From a mathematical (theoretical) point of view, such a measurement operator can easily be constructed by defining each of the 2^n states to be distinguished as a projector associated with the measurement operation. It is true, however, that a physical realization of this mathematical construction is extremely difficult, if not impossible, to achieve in practice with today's technology. The experimental demonstration of the superdense coding protocol mentioned at the end of Section 4.8.2 clearly shows this difficulty (for just two qubits!).


Yet, if there is any hope of seeing a joint measurement performed in the future, then only a device operating in a parallel, synchronous fashion on all n qubits (as explained above) would succeed. It is perhaps worth emphasizing that if such a measurement cannot be applied, then the desired distinguishability can no longer be achieved, regardless of how many other measuring operations we are allowed to perform. In other words, even an infinite sequence of measurements, each touching at most n - 1 qubits at the same time, cannot equal a single joint measurement involving all n qubits. Furthermore, with respect to the particular distinguishability problem that we have to solve, a single joint measurement capable of observing n - 1 qubits simultaneously offers no advantage whatsoever over a sequence of n - 1 consecutive single-qubit measurements. This is due to the fact that an entangled state like

$$
\frac{1}{\sqrt{2}}(|000\cdots0\rangle + |111\cdots1\rangle)
$$

can be decomposed neither as a product of n - 1 individual states nor as a product of two states (one describing a single qubit and the other describing the subsystem composed of the remaining n - 1 qubits). Any other intermediate decomposition is also impossible. Overall, our distinguishability problem can be tackled successfully only within a parallel approach, where we can measure all the qubits simultaneously. In this sense, distinguishing among entangled quantum states can be viewed as a quantum variant of the measure-compute-set problem formulated in Ref. [48], which also admits only a parallel solution.

The inherent parallelism characterizing the task of distinguishing among entangled quantum states through measurements implies that a device capable of measuring at most n qubits simultaneously (where n is a fixed, finite number) will fail to solve the distinguishability problem for n + 1 qubits. For this reason, our example joins the other paradigms illustrated in quantum mechanical terms throughout this chapter to support the idea, advanced in Ref. [15], about the impossibility of realizing the concept of a Universal Computer.

Conceptually, distinguishing among entangled quantum states is a quantum example of measuring interdependent variables. In this particular quantum instance, the interdependence between variables takes the form of entanglement between qubits, the phenomenon ultimately responsible for making a parallel approach imperative. But not only does measuring entangled states require a parallel solution; quantum evolutions that have to maintain a certain entangled state may also resort to parallelism in order to achieve their goal. In Section 4.8.5, entanglement is investigated as a global mathematical constraint that has to be satisfied throughout a quantum computation.

4.8.3. Another approach to distinguishability

Suppose that we resort to the processing capabilities of a quantum computer, as opposed to its measurement capabilities. Unitary operators preserve inner products, so any unitary evolution of the system described by (4.13) will necessarily transform it into another orthonormal basis set. Therefore, a unitary transformation must exist that allows a subsequent measurement in the standard computational basis without any loss of information. The following result shows that such a transformation not only exists, but can in fact be implemented efficiently.

Distinguishability Theorem: The transformation between the following two orthonormal basis sets for the state space spanned by n qubits:

takes the form of entanglement between qubits, the phenomenon ultimately responsible for making a parallel approach imperative. But not only does measuring entangled states require a parallel solution; quantum evolutions that have to maintain a certain entangled state may also resort to parallelism in order to achieve their goal. In Section 4.8.5, entanglement is investigated as a global mathematical constraint that has to be satisfied throughout a quantum computation.

4.8.3. Another approach to distinguishability

Suppose that we resort to the processing capabilities of a quantum computer, as opposed to its measurement capabilities. Unitary operators preserve inner products, so any unitary evolution of the system described by (4.13) will necessarily transform it into another orthonormal basis set. Therefore, a unitary transformation must exist that allows a subsequent measurement in the standard computational basis without any loss of information. The following result shows that such a transformation not only exists, but that in fact it can be implemented efficiently.

Distinguishability Theorem: The transformation between the following two orthonormal basis sets for the state space spanned by n qubits:

(1/√2)(|000···0⟩ + |111···1⟩) ←→ |000···0⟩,
(1/√2)(|000···0⟩ − |111···1⟩) ←→ |111···1⟩,
(1/√2)(|000···1⟩ + |111···0⟩) ←→ |000···1⟩,
(1/√2)(|000···1⟩ − |111···0⟩) ←→ |111···0⟩,
...
(1/√2)(|011···1⟩ + |100···0⟩) ←→ |011···1⟩,
(1/√2)(|011···1⟩ − |100···0⟩) ←→ |100···0⟩,          (4.14)

Figure 4.8. Quantum circuit for distinguishability theorem.

can be realized by a quantum circuit comprising only a linear number of controlled-NOT and Hadamard gates.

Proof. It is easy to check that the circuit depicted in Figure 4.8 performs the required quantum transformation for the case n = 4. Shown in what follows are the intermediate quantum states for the particular input

|Φ0⟩ = (1/√2)|0000⟩ + (1/√2)|1111⟩.

These are:

|Φ1⟩ = (1/√2)|0000⟩ + (1/√2)|1110⟩,
|Φ2⟩ = (1/√2)|0000⟩ + (1/√2)|1100⟩,
|Φ3⟩ = (1/√2)|0000⟩ + (1/√2)|1000⟩ = ((1/√2)|0⟩ + (1/√2)|1⟩) ⊗ |000⟩,
|Φ4⟩ = |Φ5⟩ = |Φ6⟩ = |Φf⟩ = |0000⟩.

The generalization to an arbitrary number of qubits is straightforward. In the general case, the circuit consists of 2n − 2 controlled-NOT gates and one Hadamard gate. Due to its symmetric nature, the same quantum circuit can also perform the inverse transformation, from the normal computational basis set to the entangled basis set. □
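As a check of the theorem, the following Python/NumPy sketch simulates one plausible gate layout for the circuit of Figure 4.8 (the figure itself is not reproduced here): for n = 4, a cascade of CNOTs with qubit 1 as control and targets 4, 3, 2, a Hadamard on qubit 1, and the same CNOT cascade again. The exact ordering is an assumption reconstructed from the intermediate states |Φ1⟩, ..., |Φf⟩ listed in the proof.

    import numpy as np
    from functools import reduce

    n = 4
    dim = 2 ** n

    def ket(bits):
        v = np.zeros(dim)
        v[int(bits, 2)] = 1.0
        return v

    def cnot(control, target):
        """CNOT on n qubits; qubits are 1-indexed, qubit 1 = leftmost bit."""
        U = np.zeros((dim, dim))
        for i in range(dim):
            bits = list(format(i, "0{}b".format(n)))
            if bits[control - 1] == "1":
                bits[target - 1] = "1" if bits[target - 1] == "0" else "0"
            U[int("".join(bits), 2), i] = 1.0
        return U

    # Hadamard on qubit 1 (the leftmost tensor factor).
    H1 = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2), np.eye(dim // 2))

    # 2n - 2 = 6 CNOTs and one Hadamard; gates[0] is applied first.
    gates = [cnot(1, 4), cnot(1, 3), cnot(1, 2), H1, cnot(1, 2), cnot(1, 3), cnot(1, 4)]
    U = reduce(lambda acc, g: g @ acc, gates, np.eye(dim))

    # Verify the mapping (4.14) on every entangled basis state.
    for i in range(dim // 2):
        x = format(i, "0{}b".format(n))
        xc = "".join("1" if b == "0" else "0" for b in x)
        plus = (ket(x) + ket(xc)) / np.sqrt(2)
        minus = (ket(x) - ket(xc)) / np.sqrt(2)
        assert np.allclose(U @ plus, ket(x))     # plus states -> |x>
        assert np.allclose(U @ minus, ket(xc))   # minus states -> |~x>

    print("the circuit maps the entangled basis onto the computational basis")

Because every gate in this layout is its own inverse and the layout is symmetric, applying the same circuit to a computational basis state recreates the corresponding entangled state, as noted in the proof.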

By applying the transformation realized by this circuit, the quantum computer can disentangle the qubits composing the system and thus make the act of measuring each qubit entirely independent of the other qubits. This is possible because the final states (after the transformation) are actually classical states, which can be interpreted as the indices corresponding to the original entangled quantum states. Obtaining the correct answer to the distinguishability problem amounts to accurately computing the index associated with the given input state. The procedure detailed above gives us a reliable way to do this, 100% of the time. In other words, the function is efficiently computable (in quantum linear time) by a quantum computer.

Can the classical computer replicate the operations performed by the quantum machine? We know that a classical computer can simulate (even if inefficiently) the continuous evolution of a closed quantum system (viewed as a quantum computation in the case of an ensemble of qubits). So, whatever unitary operation is invoked by the quantum computer, it can certainly be simulated mathematically on a Turing machine. The difference resides in the way the two machines handle the uncertainty inherent in the input. The quantum computer has the ability to transcend this uncertainty about the quantum state of the input system by acting directly on the input, in a way that is specific to the physical support employed to encode or describe the input. The classical computer, on the other hand, lacks the ability to process the information at its original physical level, thus making any simulation at another level futile, precisely because of the uncertainty in the input.

It is important to emphasize that had the input state been perfectly determined, then any transformation applied to it, even though quantum mechanical in nature, could have been perfectly simulated using the classical means available to a Turing machine. However, in our case, the classical computer does not have a description of the input in classical terms and can only try to obtain one through direct measurement. This will in turn collapse the superposition characterizing the input state, leaving the classical computer with only a 50% probability of correctly identifying the original quantum

state. This means that the problem cannot be solved classically, not even by a PTM. There is no way to improve the 50% error rate of the classical approach to distinguishing among the 2^n states. So this problem tells us that what draws the separation line between a quantum and a classical computer, in terms of computational power, is not the ability to extract information from a quantum system through measurements, but the ability to process information at the physical level used to represent it. For the distinguishability problem discussed, this is the only way to deal with the nondeterminism introduced by the superposition of states.

At this point, it is imperative to make clear the implications of the present discussion for the Church–Turing thesis. The definition of the Turing machine was an extraordinary achievement in abstracting out computation as an information-manipulating process. But although the model was thought to be free of any physical assumptions, it is clear today that the description of the Turing machine harbors an implicit assumption: the information it manipulates is classical. Computation is a physical process, and the Turing machine computes in accord with the laws of classical physics (it was Rolf Landauer who first noted that “Information is physical”, and Seth Lloyd extended this observation, stating that “Information processing is also physical”95). However, the success of quantum mechanics in explaining the reality of the micro-cosmos is challenging our traditional views on information processing, forcing us to redefine what we mean by computation. In the context of quantum computation, the data in general, and the input in particular, are not restricted to classical, orthogonal values, but can be arbitrary superpositions of them. Therefore, computational problems like distinguishing among entangled quantum states are not an attack on the validity of the Church–Turing thesis; rather, they precisely define its scope. As illustrated in Figure 4.9, it is in these terms that the main result of this chapter has to be understood: the set of functions computable by a classical Turing machine is a proper subset of those computable using a quantum Turing machine.

Figure 4.9. Relationship between quantum and classical computation. (The figure depicts the classically computable functions as a proper subset of the quantum-mechanically computable functions, with Example 4.1, distinguishing among entangled quantum states, and Example 4.2 lying outside the classical subset.)

4.8.4. Some consequences

Distinguishing among entangled quantum states forms the basic building block for a series of information processing tasks that can only be accomplished by a quantum computer. Here are two such examples.

4.8.4.1. Conveying quantum information through a classical channel

The first example (Example 4.1) addresses the problem of transmitting unknown quantum information through a classical channel. In the general case, when we have no knowledge whatsoever about the quantum state to be transmitted, the task is obviously impossible. It requires a classical description of the quantum state, which cannot be obtained, since a quantum measurement would ruin the original state and cloning an unknown quantum state has been proven to be impossible. Quantum teleportation actually requires the existence of a classical channel between the source and the destination, so it could be interpreted as the transmission of an unknown quantum state through a classical channel. There is an important point to make,

however. Quantum teleportation refers only to a single qubit and requires an EPR state to be shared by the sender and the receiver prior to the teleportation. This entangled pair of qubits is actually a resource that will be consumed in the process. As mentioned earlier, the same argument can be formulated in the case of another task that is not possible through classical means, namely, superdense coding. Unlike these remarkable applications of entanglement as a physical resource, the information processing tasks investigated in this chapter do not assume the creation and distribution of entanglement in order to be completed.

After this necessary clarification, it is appropriate to note that the problem investigated in Example 4.1 is unsolvable (in its most general formulation) by both our classical and quantum computers. However, if we restrict the unknown quantum state to be a member of the set (4.13), then the task is beyond the capabilities of only the classical machine. The quantum computer can still use the circuit in Figure 4.8 to obtain a “label” of the original quantum state in classical terms, which can be subsequently transmitted via the classical channel. At the other end, the same quantum circuit can reconstruct the original quantum state, based on the classical information received.

4.8.4.2. Protecting quantum information from classical attacks

The second example (Example 4.2) is taken from the field of cryptography and gives a more concrete representation of the physical limitations of a classical computer to process information. A simple protocol may be devised to enable the transmission of information through a quantum channel, without any possibility of eavesdropping by a third party resorting only to the computational power of a classical computer. For this purpose, each pair of qubits transmitted through the channel encodes one bit of information in the following way: either (1/√2)(|00⟩ + |11⟩) or (1/√2)(|01⟩ + |10⟩) represents a 0 bit, while either (1/√2)(|00⟩ − |11⟩) or (1/√2)(|01⟩ − |10⟩) represents the bit 1.

Since the information is encoded (hidden) in the relative phase between the two terms of the superposition, no single-qubit measurements (including those performed in bases other than the standard computational basis) are better than just flipping a coin in order to guess the bit transmitted, so no information whatsoever can be gained by the classical machine. The quantum computer would be in the same situation if it resorted only to its measurement abilities. However, the quantum computer can first “evolve” the Bell basis into the normal computational basis (using the quantum circuit from Figure 4.8 for the case n = 2) and then identify the bit transmitted by reading the measurement outcome for the first qubit, as in the sketch below.

Note that this protocol can be generalized to “beat” any classical computer endowed with finite measuring capabilities. If the classical computer is able to perform a joint measurement of k qubits (where k is unbounded, but finite), then it suffices to encode a bit of information into the relative phase of an entangled quantum state comprising k + 1 qubits. In this way, the information conveyed through the quantum channel is safely kept out of reach of the classical computer, due to its limitations in processing information at the very physical level chosen to embody it.
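A minimal simulation of the two-qubit protocol (a sketch in Python with NumPy, not taken from the chapter; the gate layout is the assumed n = 2 instance of the disentangling circuit) encodes a bit in the relative phase of one of the four Bell states, disentangles the pair, measures, and reads the bit off the first qubit.

    import numpy as np

    # Two-qubit gates; qubit 1 is the most significant bit of the index.
    CNOT = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
    H1 = np.kron(np.array([[1, 1], [1, -1]]) / np.sqrt(2), np.eye(2))
    U = CNOT @ H1 @ CNOT   # assumed n = 2 circuit; the rightmost factor acts first

    s = 1 / np.sqrt(2)
    encode = {
        # bit 0: (|00> + |11>)/sqrt(2) or (|01> + |10>)/sqrt(2)
        0: [np.array([s, 0, 0, s]), np.array([0, s, s, 0])],
        # bit 1: (|00> - |11>)/sqrt(2) or (|01> - |10>)/sqrt(2)
        1: [np.array([s, 0, 0, -s]), np.array([0, s, -s, 0])],
    }

    rng = np.random.default_rng(0)
    for bit, states in encode.items():
        for psi in states:
            out = U @ psi                        # receiver disentangles the pair
            probs = out ** 2                     # Born rule (amplitudes are real)
            outcome = rng.choice(4, p=probs)     # standard-basis measurement
            decoded = outcome >> 1               # value of the first qubit
            assert decoded == bit

    print("all four Bell states decode to the intended bit")

By contrast, a single-qubit measurement on either qubit of any of the four encoding states yields 0 or 1 with probability 1/2 each, which is exactly the coin-flip situation described above for the classical eavesdropper.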

4.8.5. Transformations obeying a global constraint

Some computational problems require the transformation of a mathematical object in such a way that a property characterizing the original object is to be maintained at all times throughout the computation. This property is a global condition on the variables describing the input state, and it must be obeyed at every intermediate step in the computation, as well as for the final state. Geometric flips, map recoloring and rewriting systems are three examples of transformations that can be constrained by a global mathematical condition.22 Here, it is shown that some quantum transformations acting on entangled states may also be perceived as computations obeying a global mathematical constraint. Consider, for example, an ensemble of n qubits sharing the following entangled state:

(1/√2)|000···0⟩ + (1/√2)|111···1⟩.          (4.15)

The entanglement characterizing the above state determines a strict correlation between the values observed in case of a measurement: either all qubits are detected in the state 0 or they are all seen as 1. Suppose that this correlation has to be maintained unaltered, regardless of the local transformations each of the qubits may undergo. Such a transformation may be the application of a NOT quantum gate to any of the qubits forming the ensemble. After such an event, the particular entangled state given in (4.15) is no longer preserved, and as a consequence the correlation between the qubits will be altered. The qubit whose state was “flipped” will be observed in the complementary state with respect to the other qubits. The global mathematical constraint is no longer satisfied.

Parallelism can once again make the difference and help maintain the required entangled state. If, at the same time one or more of the qubits are “flipped”, we also apply a NOT gate to all remaining qubits, simultaneously, then the final state coincides with the initial one. In this way, although the value of each qubit has been switched, the correlation we were interested in maintaining remains the same. Also note that any attempt to act on fewer than n qubits simultaneously is doomed to failure, as the sketch below illustrates.

The state given in (4.15) is not the only one with this property. Any entangled state from the orthonormal basis set (4.13) could have been used in the example presented above. The correlation among the qubits would have been different, but the fact that applying a NOT gate, in parallel, to all qubits does not change the quantum state of the ensemble is true for each entangled state appearing in system (4.13).

Perhaps the scenario described above can be extended to other quantum transformations besides the NOT gate. Another, perhaps more interesting, generalization would be a quantum computation that has to maintain entanglement as a generic, global mathematical constraint, and not a specific type of entanglement with a particular correlation among the qubits involved. Such a computation would allow entanglement to change form, but the mathematical definition of entanglement would still have to be obeyed at each step, with each transformation.
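The following sketch (Python/NumPy, added for illustration) verifies both claims for n = 5: applying NOT to all qubits in parallel leaves the state (4.15) unchanged, while flipping only a proper subset of the qubits does not.

    import numpy as np
    from functools import reduce

    n = 5
    X = np.array([[0.0, 1.0], [1.0, 0.0]])   # the NOT gate
    I2 = np.eye(2)

    def flip(qubits):
        """Apply X on the listed qubits (1-indexed) and the identity elsewhere."""
        return reduce(np.kron, [X if q in qubits else I2 for q in range(1, n + 1)])

    ghz = np.zeros(2 ** n)
    ghz[0] = ghz[-1] = 1 / np.sqrt(2)   # state (4.15): (|00...0> + |11...1>)/sqrt(2)

    # Flipping all n qubits simultaneously preserves the global constraint...
    assert np.allclose(flip(set(range(1, n + 1))) @ ghz, ghz)

    # ...while flipping only the first k < n qubits (representative proper
    # subsets) produces a different state, breaking the correlation.
    for k in range(1, n):
        assert not np.allclose(flip(set(range(1, k + 1))) @ ghz, ghz)

    print("only the fully parallel NOT preserves state (4.15)")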

4.9. Conclusion

Quantum exclusivity was introduced in this chapter as a more powerful property of computation than quantum “supremacy”, and a number of exclusively quantum computational problems were reviewed. Each of these computations is straightforward to perform on a quantum computer. None of these computations can be carried out, in principle, whether efficiently or inefficiently, on a Turing machine, thereby violating both the Church–Turing thesis and the extended Church–Turing thesis. More fundamentally, these computations demonstrate that the Principle of Universality in computation is also invalid, meaning that a Universal Computer cannot exist.

A summary of the exclusively quantum computational problems described in this chapter, and the evolving computational paradigms to which they belong, is presented in Table 4.1. In each of these paradigms, the information is encoded and processed using quantum mechanical means. As well, in each case a parallel approach offers the only way of seeing the task accomplished. This proves that parallelism, as a concept, transcends the boundaries

Table 4.1. Exclusively quantum computations.

  Paradigm | Description | Quantum example
  1. Rank-varying complexity | The complexity of a computational step is a function of its rank. | Quantum Fourier Transform
  2. Time-varying complexity | The complexity of a step depends on when it is executed. | Quantum error correction
  3. Time-varying variables | Input variables change their values with time. | Quantum decoherence
  4. Interacting variables | Input data are interconnected, affecting each other’s behavior. | Measuring entangled states
  5. Computations obeying a global constraint | A certain global property has to be maintained throughout the computation. | Maintaining entanglement

imposed by a particular way of representing and transforming information. The computational problems addressed herein clearly demonstrate the value of a parallel solution for quantum computation and information, confirming the capital role played by parallelism in the theory of computation. It is worth emphasizing again that the reference here is to the common understanding of the term parallelism and not to quantum parallelism. The latter phrase is used to denote the ability to perform a certain computation simultaneously on all terms of a quantum superposition, regardless of the number of qubits composing the quantum register whose state is described by that superposition. As opposed to this interpretation, the meaning attributed to parallelism here is the ability to act simultaneously on a certain number of qubits. Thus, it can rightfully be asserted that parallelism transcends the laws of physics and represents a fundamental aspect of computation, regardless of the particular physical way chosen to embody information.

A more subtle connection exists between parallelism and the hypothetical notion of a Universal Computer — a machine with fixed and finite characteristics, capable of simulating any other computing device. One thing common to all examples presented in this chapter is that if the degree of parallelism required to solve a certain problem is not available, then no approach can be successful. To be more precise, if n processing units are needed to solve a problem, then a machine endowed with n − 1 processors is not able to complete the task. In other words, such a machine is not capable of simulating the successful parallel algorithm running on the n-processor device (even if it is given an unbounded amount of time and memory to perform the simulation). And since the Principle of Simulation is the one supporting the myth of a Universal Computer, it must be concluded that the existence of such a machine is impossible.

Also deserving attention is the unconventional aspect of the computing paradigms responsible for uncovering this result on universality. This further motivates the study of non-traditional computational environments and proves that sometimes the results can be surprising.

When he devised the quantum Turing machine as the first abstract model of quantum computation, David Deutsch already pointed to some features that set it apart from the classical Turing machine: intrinsic genuine non-determinism and entanglement. Naturally, these features have created much speculation about the superiority of the quantum computer in terms of computability and complexity. Consequently, the computational powers of the quantum and classical machines have been evaluated and compared in a variety of contexts. This chapter shows that there is a whole class of information processing tasks with respect to which a clear separation line exists between quantum computers and classical computers in terms of their computational powers. The set of problems solvable by classical means is therefore strictly smaller than the set of functions computable through quantum means.

At the heart of this separation lies a problem (namely, distinguishing among entangled quantum states) that combines uncertainty and entanglement in a way that renders a classical simulation of the quantum solution impossible. Otherwise, taken separately, uncertainty can be dealt with (through measurements) in the absence of entanglement, while entanglement, as a particular case of superposition, can be simulated by a classical machine for the purpose of computation (due to the linearity of the unitary operators describing quantum transformations).

While quantum measurements are certainly required to distinguish among different quantum states, this is most emphatically not what gives the quantum Turing machine the advantage over the classical Turing machine. Also, this superiority is not due to some theoretical property specific to “hypercomputers”, which breaks in one way or another the finiteness condition by implicitly assuming some form of unlimited computational resources. It is also not a matter of complexity, the ability to solve problems much faster than is possible classically. This chapter shows that quantum computers are better than classical ones (whether deterministic or probabilistic) in terms of computability (function evaluation), due to the power conferred on their computations by the way they represent information at the physical level.

Classical physics is just a special case of quantum mechanics, an approximation that is useful in some circumstances. Sometimes, information encoded in genuine quantum mechanical terms cannot be successfully manipulated unless the computing device has the power to process this information directly at the physical level used to represent it. The properties of the physical level chosen to embody information in a computational model ultimately determine its computational capabilities and power, because information is intrinsically physical and cannot be abstracted away from its physical support. The limitations of the classical Turing machine are therefore purely physical.

So, is a machine that computes following the principles of quantum mechanics really more powerful than a computing device designed in accord with classical physics? This chapter endeavored to prove that the answer is definitely affirmative. And the difference is made by those problems, defined in purely quantum mechanical terms, whose quantum solutions are impossible to simulate classically.

References

1. F. Arute et al., Quantum supremacy using a programmable superconducting processor. Nature 574, 505–510 (2019).
2. E. Pednault, J. Gunnels, D. Maslov, and J. Gambetta, On quantum supremacy, October 21, 2019. https://www.ibm.com/blogs/research/2019/10/on-quantum-supremacy/
3. E. Pednault, J. A. Gunnels, G. Nannicini, L. Horesh, and R. Wisnieff, Leveraging secondary storage to simulate deep 54-qubit Sycamore circuits (2019). https://arxiv.org/abs/1910.09534
4. S. Aaronson, Shtetl-Optimized: The Blog of Scott Aaronson (2019). https://www.scottaaronson.com/blog/?p=4372
5. M. Bradley, Google claims to have invented a quantum computer, but IBM begs to differ. The Conversation, January 20, 2020. https://theconversation.com/google-claims-to-have-invented-a-quantum-computer-but-ibm-begs-to-differ-127309
6. E. Conover, Google claimed quantum supremacy in 2019 — and sparked controversy. Science News, December 16, 2019. https://www.sciencenews.org/article/google-quantum-supremacy-claim-controversy-top-science-stories-2019-yir
7. L. Crane, Google’s quantum supremacy algorithm has found its first practical use. New Scientist, 13 December 2019. https://www.newscientist.com/article/2227490-googles-quantum-supremacy-algorithm-has-found-its-first-practical-use/

8. R. Goodwins, Google and IBM square off in Schrodinger’s catfight over quantum supremacy. The Register, January 9, 2020. https://www.theregister.co.uk/2020/01/09/google_and_ibm_square_off_in_schrodingers_catfight/
9. C. Lee, Why I dislike what quantum supremacy is doing to computing research. Ars Technica, 23 December 2019. https://arstechnica.com/science/2019/12/optical-quantum-computer-goes-big-in-new-quest-for-quantum-supremacy/
10. H. McCracken, Why Google’s ‘Sputnik’ moment isn’t the last word in quantum supremacy. Fast Company, November 27, 2019. https://www.fastcompany.com/90421236/quantum-supremacy-isnt-a-moment-and-thats-okay
11. D. Phillips, Quantum computing boost for IBM but Bitcoin stays safe. Decrypt, January 9, 2020. https://decrypt.co/16211/quantum-computing-boost-for-ibm-but-bitcoin-stays-safe
12. H. Wang et al., Boson sampling with 20 input photons and a 60-mode interferometer in a 10^14-dimensional Hilbert space. Phys. Rev. Lett. 123, 250503 (2019). doi: 10.1103/PhysRevLett.123.250503.
13. S. G. Akl, Non-Universality in Computation: The Myth of the Universal Computer (School of Computing, Queen’s University, 2005). http://research.cs.queensu.ca/Parallel/projects.html
14. S. G. Akl, A Computational Challenge (School of Computing, Queen’s University, 2005). http://www.cs.queensu.ca/home/akl/CHALLENGE/A_Computational_Challenge.htm
15. S. G. Akl, The myth of universal computation. In R. Trobec, P. Zinterhof, M. Vajteršic, and A. Uhl (eds.), Parallel Numerics, Part 2, Systems and Simulation, University of Salzburg, Austria and Jožef Stefan Institute, Ljubljana, Slovenia (2005), pp. 211–236.
16. S. G. Akl, Universality in computation: Some quotes of interest, Technical Report No. 2006-511, School of Computing, Queen’s University (2006). http://www.cs.queensu.ca/home/akl/techreports/quotes.pdf
17. S. G. Akl, Three counterexamples to dispel the myth of the universal computer. Parallel Process. Lett. 16(3), 381–403 (2006).
18. S. G. Akl, Conventional or unconventional: is any computer universal? In A. Adamatzky and C. Teuscher (eds.), From Utopian to Genuine Unconventional Computers (Luniver Press, 2006), pp. 101–136.
19. S. G. Akl, Gödel’s incompleteness theorem and nonuniversality in computing. In M. Nagy and N. Nagy (eds.), Proceedings of the Workshop on Unconventional Computational Problems, Sixth International Conference on Unconventional Computation, Kingston, Ontario (2007), pp. 1–23.
20. S. G. Akl, Even accelerating machines are not universal. Int. J. Unconvent. Comput. 3(2), 105–121 (2007).
21. S. G. Akl, Unconventional computational problems with consequences to universality. Int. J. Unconvent. Comput. 4(1), 89–98 (2008).
22. S. G. Akl, Evolving computational systems. In S. Rajasekaran and J. H. Reif (eds.), Parallel Computing: Models, Algorithms, and Applications (Taylor and Francis, 2008), pp. 1–22.

23. S. G. Akl, Ubiquity and simultaneity: the science and philosophy of space and time in unconventional computation, Keynote address. In Conference on the Science and Philosophy of Unconventional Computing (The University of Cambridge, Cambridge, 2009).
24. S. G. Akl, Time travel: A new hypercomputational paradigm. Int. J. Unconvent. Comput. 6(5), 329–351 (2010).
25. S. G. Akl, What is computation? Int. J. Parallel, Emergent Distrib. Syst. 29(4), 337–345 (2014).
26. S. G. Akl, Nonuniversality explained. Int. J. Parallel, Emergent Distrib. Syst. 31(3), 201–219 (2016).
27. S. G. Akl, Nonuniversality in computation: Fifteen misconceptions rectified. In A. Adamatzky (ed.), Advances in Unconventional Computing (Springer, 2017), pp. 1–31.
28. S. G. Akl, Unconventional computational problems. In R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science (Springer, 2018), pp. 631–639.
29. S. G. Akl, Unconventional wisdom: Superlinear speedup and inherently parallel computations. In A. Adamatzky (ed.), From Parallel to Emergent Computing (Taylor & Francis, 2018), pp. 347–366.
30. S. G. Akl, From parallelism to nonuniversality: An unconventional trajectory. In A. Adamatzky and V. Kendon (eds.), From Astrophysics to Unconventional Computing (Springer, 2019), pp. 123–156.
31. S. G. Akl and N. Salay, On computable numbers, nonuniversality, and the genuine power of parallelism. Int. J. Unconv. Comput. 11(3&4), 283–297 (2015). Also in: A. Adamatzky (ed.), Emergent Computation: A Festschrift for Selim G. Akl (Springer, 2017), pp. 57–69.
32. M. Nagy and S. G. Akl, On the importance of parallelism for quantum computation and the concept of a universal computer. In C. S. Calude, M. J. Dinneen, G. Păun, M. J. Pérez-Jiménez, and G. Rozenberg (eds.), Unconventional Computation (Springer, 2005), pp. 176–190.
33. M. Nagy and S. G. Akl, Quantum computation and quantum information. Int. J. Parallel, Emergent Distrib. Syst. 21(1), 1–59 (2006).
34. M. Nagy and S. G. Akl, Quantum measurements and universal computation. Int. J. Unconv. Comput. 2(1), 73–88 (2006).
35. M. Nagy and S. G. Akl, Quantum computing: Beyond the limits of conventional computation. Int. J. Parallel, Emergent Distrib. Syst. 22(2), 123–135 (2007).
36. M. Nagy and S. G. Akl, Parallelism in quantum information processing defeats the Universal Computer. Parallel Process. Lett., Spec. Issue Unconv. Comput. Probl. 17(3), 233–262 (2007).
37. M. Nagy and S. G. Akl, Coping with decoherence. Parallel Process. Lett., Spec. Issue Adv. Quantum Comput. 20(3), 213–226 (2010).
38. M. Nagy and S. G. Akl, Using Quantum Mechanics to Enhance Information Processing (Scholars’ Press, 2012).

39. N. Nagy and S. G. Akl, Computations with uncertain time constraints: effects on parallelism and universality. In C. S. Calude, J. Kari, I. Petre, and G. Rozenberg (eds.), Unconventional Computation (Springer, 2011), pp. 152–163.
40. N. Nagy and S. G. Akl, Computing with uncertainty and its implications to universality. Int. J. Parallel, Emergent Distrib. Syst. 27, 169–192 (2012).
41. S. G. Akl, Parallel Computation: Models and Methods (Prentice Hall, 1997).
42. S. G. Akl and M. Nagy, Introduction to parallel computation. In R. Trobec, M. Vajteršic, and P. Zinterhof (eds.), Parallel Computing: Numerics, Applications, and Trends (Springer-Verlag, 2009), pp. 43–80.
43. S. G. Akl and M. Nagy, The future of parallel computation. In R. Trobec, M. Vajteršic, and P. Zinterhof (eds.), Parallel Computing: Numerics, Applications, and Trends (Springer-Verlag, 2009), pp. 471–510.
44. M. Nagy and S. G. Akl, Computing nearest neighbors in real time. J. Parallel Distrib. Comput. 66, 359–366 (2006).
45. N. Nagy and S. G. Akl, The maximum flow problem: A real-time approach. Parallel Comput. 29(6), 767–794 (2003).
46. S. G. Akl, Discrete steepest descent in real time. Parallel Distrib. Comput. Pract. 4(3), 301–317 (2001).
47. S. G. Akl, Superlinear performance in real-time parallel computation. J. Supercomput. 29, 89–111 (2004).
48. S. G. Akl, Coping with uncertainty and stress: A parallel computation approach. Int. J. High Perform. Comput. Network. 4(1&2), 85–90 (2006).
49. S. G. Akl, B. Cordy, and W. Yao, An analysis of the effect of parallelism in the control of dynamical systems. Int. J. Parallel, Emergent Distrib. Syst. 20(2), 147–168 (2005).
50. S. G. Akl, Inherently parallel geometric computations. Parallel Process. Lett. 16(1), 19–37 (2006).
51. P. Dirac, The Principles of Quantum Mechanics, 4th edn. (Oxford University Press, 1958).
52. I. Glendinning, The Bloch sphere. European Centre for Parallel Computing at Vienna (2005). http://www.vcpc.univie.ac.at/~ian/hotlist/qc/talks/bloch-sphere.pdf
53. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 2000).
54. E. Schrödinger, Discussion of probability relations between separated systems. Proc. Cambridge Philos. Soc. 31, 555–563 (1935).
55. J. Bell, On the Einstein–Podolsky–Rosen paradox. Physics 1, 195–200 (1964).
56. A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777–780 (1935).
57. D. Deutsch, Quantum theory, the Church–Turing principle, and the Universal Quantum Computer. Proc. Roy. Soc. Lond. A 400, 97–117 (1985).
58. D. Deutsch and R. Jozsa, Rapid solution of problems by quantum computation. Proc. Roy. Soc. Lond. A 439, 553–558 (1992).

59. P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26(5), 1484–1509 (1997).
60. A. K. Lenstra and H. W. Lenstra, Jr. (eds.), The Development of the Number Field Sieve (Springer-Verlag, 1993).
61. R. Feynman, Simulating physics with computers. Int. J. Theor. Phys. 21(6&7), 467–488 (1982).
62. E. Benjamin, K. Huang, A. Kamil, and J. Kittiyachavalit, Quantum computability and complexity and the limits of quantum computation (2003). http://www.cs.berkeley.edu/~kamil/quantum/qc4.pdf
63. E. Bernstein and U. Vazirani, Quantum complexity theory. SIAM J. Comput. 26(5), 1411–1473 (1997).
64. A. Berthiaume and G. Brassard, Oracle quantum computing. J. Modern Opt. 41(12), 2521–2535 (1994).
65. S. Robinson, Emerging insights on limitations of quantum computing shape quest for fast algorithms. SIAM News 36(1), 1–3 (2003).
66. C. S. Calude and B. Pavlov, Coins, quantum measurements, and Turing’s barrier. Quantum Inform. Process. 1(1&2), 107–127 (2002).
67. T. D. Kieu, Quantum adiabatic algorithm for Hilbert’s tenth problem: I. The algorithm (2003). http://arxiv.org/abs/quant-ph/0310052
68. R. Penrose, The Emperor’s New Mind (Oxford University Press, 1989).
69. R. Penrose, Shadows of the Mind (Oxford University Press, 1994).
70. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms (MIT Press, 2001).
71. M. Nagy and S. G. Akl, Parallelizing the Quantum Fourier Transform. In Nineteenth International Conference on Parallel and Distributed Computing Systems (San Francisco, 2006), pp. 108–113.
72. R. Griffiths and C.-S. Niu, Semiclassical Fourier transform for quantum computation. Phys. Rev. Lett. 76, 3228–3231 (1996).
73. M. Nagy and S. G. Akl, Quantum key distribution revisited, Technical Report 2006-516, School of Computing, Queen’s University, Kingston, Ontario (2006). http://research.cs.queensu.ca/TechReports/Reports/2006-516.pdf
74. M. Nagy, S. G. Akl, and S. Kershaw, Key distribution based on the quantum Fourier transform. Int. J. Secur. Appl. 3(4), 45–67 (2009).
75. N. Nagy, S. G. Akl, and M. Nagy, Applications of Quantum Cryptography (Lambert Academic Publishing, 2016).
76. C. Cohen-Tannoudji, B. Diu, and F. Laloe, Quantum Mechanics, Vols. 1 & 2 (Wiley, 1977).
77. J. Preskill, Fault-tolerant quantum computation. In H.-K. Lo, S. Popescu, and T. Spiller (eds.), Introduction to Quantum Computation and Information (World Scientific, 1998), pp. 213–269.
78. P. W. Shor, Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A 52, 2493–2496 (1995).
79. A. M. Steane, Error correcting codes in quantum theory. Phys. Rev. Lett. 77(5), 793–797 (1996).

80. C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters, Mixed state entanglement and quantum error correction. Phys. Rev. A 54, 3824–3851 (1996).
81. R. Laflamme, C. Miquel, J. P. Paz, and W. H. Zurek, Perfect quantum error correction code (1996). http://arxiv.org/abs/quant-ph/9602019
82. A. R. Calderbank and P. W. Shor, Good quantum error-correcting codes exist. Phys. Rev. A 54(2), 1098–1106 (1996).
83. A. M. Steane, Multiple particle interference and quantum error correction. Proc. Roy. Soc. Lond. A 452, 2551–2576 (1996).
84. D. Gottesman, Class of quantum error-correcting codes saturating the quantum Hamming bound. Phys. Rev. A 54, 1862–1868 (1996).
85. A. Ekert and C. Macchiavello, Quantum error correction for communication. Phys. Rev. Lett. 77, 2585–2588 (1996).
86. J. Preskill, Reliable quantum computers. Proc. Roy. Soc. Lond. A 454, 385–410 (1998).
87. A. Barenco, A. Berthiaume, D. Deutsch, A. Ekert, R. Jozsa, and C. Macchiavello, Stabilization of quantum computations by symmetrization (1996). http://xxx.lanl.gov/abs/quant-ph/9604028
88. A. Berthiaume, D. Deutsch, and R. Jozsa, The stabilization of quantum computation. In Proceedings of the Workshop on Physics and Computation: PhysComp ’94 (IEEE Computer Society Press, 1994), pp. 60–62.
89. A. Peres, Error symmetrization in quantum computers (1996). http://xxx.lanl.gov/abs/quant-ph/9605009
90. S. Stepney, S. L. Braunstein, J. A. Clark, A. Tyrrell, A. Adamatzky, R. E. Smith, T. Addis, C. Johnson, J. Timmis, P. Welch, R. Milner, and D. Partridge, Journeys in non-classical computation I: A grand challenge for computing research. Int. J. Parallel, Emergent Distrib. Syst. 20(1), 5–19 (2005).
91. G. Okša, M. Bečka, and M. Vajteršic, Parallel computation with structured matrices in linear modeling of multidimensional signals. Parallel Distrib. Comput. Pract. 5(3), 289–299 (2004).
92. B. Barutçu, S. Şeker, E. Ayaz, and E. Türkcan, Real-time reactor noise diagnostics for the Borssele (PWR) nuclear power plant. Progr. Nucl. Energy 43(1–4), 137–143 (2003).
93. C. H. Bennett and S. J. Wiesner, Communication via one- and two-particle operators on Einstein–Podolsky–Rosen states. Phys. Rev. Lett. 69(20), 2881–2884 (1992).
94. K. Mattle, H. Weinfurter, P. G. Kwiat, and A. Zeilinger, Dense coding in experimental quantum communication. Phys. Rev. Lett. 76(25), 4656–4659 (1996).
95. S. Lloyd, Programming the Universe: A Quantum Computer Scientist Takes On the Cosmos (Knopf, 2006).

© 2021 World Scientific Publishing Company https://doi.org/10.1142/9789811235726 0005

Chapter 5

Estimations of Integrated Information Based on Algorithmic Complexity and Dynamic Querying

Alberto Hernández-Espinosa∗, Hector Zenil†,‡,§,¶,∗∗, Narsis A. Kiani†,‡,§ and Jesper Tegnér‡,‖

∗Department of Mathematics, Faculty of Sciences, UNAM, Mexico
†Algorithmic Dynamics Lab, Karolinska Institute (KI)
‡Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine Solna, KI, Stockholm, Sweden
§Algorithmic Nature Group, Laboratory of Scientific Research (LABORES) for the Natural and Digital Sciences, Paris, France
¶Oxford Immune Algorithmics, Reading, UK
‖Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Kingdom of Saudi Arabia
∗∗[email protected]

Integrated information has been introduced as a metric to quantify the amount of information generated by a system beyond the information generated by its individual elements. While the metrics associated with the Greek letter φ require the calculation of the interaction of an exponential number of sub-divisions of the system, most of these numerical approaches related to the metric are based on the basics of classical information theory and perturbation analysis. Here we introduce and

sketch alternative approaches to connect algorithmic complexity and integrated information, based on the concept of algorithmic perturbation rooted in algorithmic information dynamics and its concept of programmability. We hypothesize that if an object is algorithmically random or algorithmically simple, random algorithmic perturbations will have little to no effect on the internal capabilities of a system to produce integrated information, but that when an object is more integrated, it will also display elements able to perturb the object and increase or decrease its algorithmic randomness. We sketch some of these ideas relating an object's integrated information value to its algorithmic information content. We propose that such an algorithmic perturbation test, quantifying compression sensitivity, may provide a system with a means to extract explanations (causal accounts) of its own behavior, hence making IIT and the associated measure φ more explainable and interpretable. Our technique may reduce the number of calculations needed to arrive at some estimations, with algorithmic perturbation guiding a more efficient search. Our work sets the stage for a systematic exploration and further investigation of the connections between algorithmic complexity and integrated information at the level of both theory and practice.

5.1. Introduction

The concept of information has emerged as a language in its own right, bridging several disciplines that analyze natural phenomena and man-made systems. The development of techniques to decipher the structure and dynamics of complex systems is a rich interdisciplinary research area which is not only of fundamental interest but also important in numerous applications. Broadly speaking, dynamical aspects such as stability and state-transitions within such systems have been of major interest in statistical physics, dynamical systems, and computational neuroscience.1–3 Here, complex systems are defined by a set of nonlinear evolution equations. Cellular automata, spin-glass systems, Hopfield networks, and Boolean networks have, for example, been used as numerical experimental model systems to investigate the dynamical aspects of complex systems. Due to the complexity of the analysis, notions such as symmetries in the systems, averaging (e.g., mean-field techniques), and separation of time-scales have all been instrumental in deciphering the core principles at work in such complex systems. In parallel, network science has emerged

as a rich interdisciplinary field, essentially analyzing the structure of real networks in different areas of science and in diverse application domains.4 Examples include social, biological and electrical networks, the web, business networks and the interconnected internet. By a structural analysis, which has dominated these investigations, we refer to statistical descriptions of network connectivity. Networks can be described globally, in terms ranging from the degree to which they differ from a random Poisson distribution of links, to their modular organization, including their local properties such as local clustering around nodes, special nodes/links with high degrees of betweenness or serving specific roles in the network, and local motif structures. Such network features can be used to classify and describe similarities and differences between what appear to be different classes of networks across and within different application domains. Finally, due to the rich representational capacity of networks and their usefulness across science, technology, and applications, work in machine learning, in particular graph convolutional networks and embedding techniques, is currently making headway in devising ways to map these non-regular network objects onto a format such that machine learning techniques can be used to analyze their properties.5

Now, if integrated information theory (IIT) is proposed to be of relevance for the analysis of complex networks, we may ask how IIT is related to the fundamental questions underpinning research and thinking on complex systems. On the one hand, we find a rich body of work dealing with what could be referred to as technical, computational challenges and application-driven investigations: for example, which global and local properties should be computed, and how to do so in an efficient manner. However, at a more fundamental level we find essentially two challenges, which in our view have a bearing on the core intellectual driving force of complex systems. First: What is the origin of, and what are the mechanisms propelling, order in complex systems? Secondly, and of major concern for the present paper: Is the whole — in some sense — larger than the sum of its parts? Both questions are vague when formulated in words, as above, but they can readily be technically specified within a model class. The motivation for the second question is that it appears that there are

indeed phenomena in nature which cannot easily be explained only with reference to their parts, but seem to require that we adopt a holistic view. Since Anderson’s classic 1972 essay, there has been an animated and at times heated discussion of whether there is anything which could be referred to as emergence.6

Tononi and his group have developed a formalism to quantify the amount of information generated by a system — defined as a set of interconnected elements — beyond the information generated by the parts (subsets) of the system. Their motivation was to develop a theory of consciousness.7 In that quest, they perceived a necessity to define a measure which could quantify the amount and degree of consciousness, a measure they refer to as φ, which in turn constitutes the core of Integrated Information Theory or IIT. Importantly, in the present work we distinguish between the issue of the relevance of φ for consciousness and the technical numerical question of how to calculate φ. Here we address the computation of φ, as it is potentially a means toward a precise formulation of the possible causal relation between a whole and the parts of a system, regardless of its purported relevance to consciousness.

To calculate φ, Tononi and collaborators have developed a computational toolbox.8 Yet, calculating φ comes with a severe computational cost, as the calculation scales exponentially with the number of elements in the network. Furthermore, the computation requires knowledge of the transition probabilities of the system, which in practice makes the computation intractable for anything larger than small systems on the order of one magnitude in size. The calculation of φ requires a division of the system into smaller subsets, ranging from large pieces down to singletons, and every division into k pieces can be instantiated in N^k different ways. Using this procedure from Tononi, elements that have small causal influences on the activity of other elements can be identified. A system with low φ is therefore characterized by the fact that changes in subsets of the system do not affect the rest of the system. Such a system is therefore considered to be a non-integrated system. This observation entails a key insight, namely, that if a system is highly integrated among its parts, then the different parts can be related to each other, or more precisely, they

can be used to describe other parts of the system. Then the parts are in some sense simple and should be compressible. This is the observation and intuition behind our method, which employs a formalized notion of complexity to exploit this insight and thereby allow a more efficient, guided search in the space of algorithmic distances, in contrast to exhaustive computations of the distance between statistical distributions, as currently implemented in IIT. Technically, we are therefore not required to perform a full computation of what is referred to as the input-output repertoire (see Section 5.3, Methods, for technical details).

This, in brief, is our motivation for introducing our method, which is based on algorithmic information dynamics.9–11 At its core is a causal perturbation analysis and a measure of sophistication connected to algorithmic complexity. Our approach exploits the idea that causal deterministic systems have a simple algorithmic description and thus a simple generating mechanism sufficient to simulate and reproduce complex systemic behavior. Using this technique we can assess the effect of perturbations, and thereby exploit the fact that, depending on the algorithmic complexity of a system, a perturbation will induce different degrees of change in algorithmic space. In short, a system will be highly integrated only if the removal or perturbation of its parts has a nonlinear effect on the generative program producing the system in the first place.

Interestingly, even Tononi suggested early on that algorithmic complexity could be connected to the computation of integrated information.12 However, a lossless compression algorithm was used to approximate φ. Here we contribute to the formalization of such a suggestion by using stronger tools, which we have recently developed, to approximate complexity. At the core of algorithmic information is the concept of minimal program size and Kolmogorov–Chaitin complexity.13,14 Briefly, the Kolmogorov–Chaitin complexity K(x) of an object x is the length of the shortest computer program that produces x and halts. K(x) is uncomputable but can be approximated from above, meaning one can find upper bounds by using compression algorithms, or rather more powerful techniques such as those based on algorithmic probability,15–17 given that popular lossless compression algorithms are limited and more

closely related to classical Shannon entropy than to K itself.19–21 One reason for this state of affairs is that, as demonstrated in Ref. [18], there is a fundamental difference between algorithmic and statistical complexity with respect to how randomness is characterized in opposition to causation. Specifically, algorithmic complexity implies a deterministic description of an object (it defines the algorithmic information content of an individual sequence/object), whereas statistical complexity implies a statistical description (it refers to an ensemble of sequences generated by a certain source). Approaches such as transfer entropy,22 Granger causality,23 and Partial Information Decomposition,24,25 which are based on regression, correlation, and/or a combination of regression, correlation and intervention, but which ultimately rely on probability distributions, fall into this category. Hence, for better-founded methods and algorithms for estimating algorithmic complexity, we recommend the use of our tools, which are already being used by independent groups working on, for example, biological modeling,26 cognition27 and consciousness.28 These tools are based on the theory of algorithmic probability, and are not free from challenges and limitations, but they are better connected to the algorithmic side of algorithmic complexity, rather than only to the statistical pattern-matching side that current approaches using popular lossless compression algorithms exploit, making those approaches potentially misleading.19

Our procedure, in brief, is as follows. First, we deduce the rules in systems of interest: we apply the perturbation test introduced in Refs. [9, 11, 29] to ascertain the computational capabilities of networks. Next, simple rules are formalized and implemented to simulate the behavior of these systems. Following this analysis, we perform an automatic procedure, referred to as a meta-perturbation test, which is applied to the behavior obtained by the aforementioned simple rules, in order to arrive at explanations of such behavior. We incorporate the ideas of an interventionist calculus (cf. Judea Pearl30) and perturbation analysis within what we call Algorithmic Information Dynamics, and we go beyond pattern identification using probability theory, classical statistics, and correlation analysis by developing a model-driven

approach that is fed by data. This contrasts with a purely data-driven approach, and is a consequence of the fact that our analysis considers the algorithmic distance between models.
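As a crude, self-contained illustration of the compression-sensitivity idea mentioned in the abstract, the following Python sketch flips each bit of a string and records how the compressed length changes. It uses the zlib compressor as an upper-bound proxy for K; this is a stand-in invented for illustration, not the algorithmic-probability tooling the chapter actually advocates, and the caveat above that such compressors track Shannon-type regularities applies in full.

    import random
    import zlib

    def c(bits):
        """Length in bytes of the zlib-compressed string: a crude upper bound proxy for K."""
        return len(zlib.compress(bits.encode(), 9))

    def perturbation_profile(bits):
        """Change in compressed length when each bit is flipped in turn."""
        base = c(bits)
        return [c(bits[:i] + ("0" if b == "1" else "1") + bits[i + 1:]) - base
                for i, b in enumerate(bits)]

    simple = "01" * 256                                        # algorithmically simple
    random.seed(1)
    noisy = "".join(random.choice("01") for _ in range(512))   # (pseudo-)random-looking

    for name, s in [("simple", simple), ("random-looking", noisy)]:
        prof = perturbation_profile(s)
        print("{:15s} C = {:4d} bytes, per-bit perturbation range = [{}, {}]".format(
            name, c(s), min(prof), max(prof)))

On typical runs, single-bit flips barely move the estimate for the random-looking string, while each flip injects a small, localized irregularity into the periodic one; an integrated object, by the hypothesis stated in the abstract, would instead contain distinguished elements whose perturbation shifts the estimate strongly in either direction.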

5.2. Basic Concepts of Integrated Information Theory

Integrated information theory (IIT) postulates that consciousness is identical to integrated information and that a system’s capacity for consciousness can be expressed by a quantitative measure denoted by φ. Tononi defines integrated information as “the amount of information generated by a complex of elements, above and beyond the information generated by its parts” (Consciousness as integrated information: a provisional manifesto46) and states, “The integrated information theory (IIT) of consciousness claims that, at a fundamental level, consciousness is integrated information” (Consciousness as integrated information: a provisional manifesto,46 italics in original). IIT aims to explain “relationships between consciousness and the Physical Substrate of Consciousness (PSC), and starts from essential properties of phenomenal experience, and derives the requirements for the physical substrate of consciousness.”31

5.2.1. Calculus of φ

Integrated information theory defines integrated information (φ) as the effective information of the minimum information partition (MIP) in a system.7 The MIP is in turn defined as the partition having minimum effective information among all possible partitions:

φ[X; x] := ϕ[X; x, MIP(x)],
MIP(x) := argmin_P ϕ[X; x, P],

where X is the system, x is a state, and P = {M1, . . . , Mr} is a partition. Importantly, identifying the MIP requires searching all possible partitions and comparing their effective information ϕ, as illustrated by the sketch below.
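To see the combinatorics involved, the following Python sketch enumerates all bipartitions of a small node set and selects the one minimizing a scoring function, mirroring the argmin above. The scoring function toy_phi is a placeholder invented for this illustration; real IIT scores a partition by comparing cause/effect repertoires, as discussed next.

    import itertools

    def bipartitions(nodes):
        """All unordered splits of `nodes` into two non-empty parts."""
        nodes = list(nodes)
        first, rest = nodes[0], nodes[1:]
        for r in range(len(rest) + 1):
            for combo in itertools.combinations(rest, r):
                part1 = [first, *combo]
                part2 = [v for v in nodes if v not in part1]
                if part2:
                    yield part1, part2

    def find_mip(nodes, phi):
        """MIP(x) = argmin over partitions P of the effective information phi(P)."""
        return min(bipartitions(nodes), key=phi)

    def toy_phi(partition):
        """Hypothetical score, for illustration only: prefers balanced splits."""
        part1, part2 = partition
        return abs(len(part1) - len(part2))

    nodes = list(range(8))
    print(sum(1 for _ in bipartitions(nodes)), "bipartitions of 8 nodes")  # 2**7 - 1 = 127
    print("toy MIP:", find_mip(nodes, toy_phi))

Even restricted to bipartitions, the count roughly doubles with every added node; this exhaustive search is exactly what the guided, complexity-based strategy of this chapter aims to avoid.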

This effective information is specified in terms of effect and causal information, that is, the distance between two probability distributions: one for the unpartitioned (unconstrained) set of nodes (this can be the full set of nodes of the whole system or one of its possible partitions) and one for a partition of the latter. Such probability distributions determine the probabilities of all possible future (effect) or past (causal) states of an arbitrary partition being in a current state. This means that, when one set of nodes (the full set of nodes of the system or a subset/partition of itself) is compared with all its possible partitions, the MIP is the partition with the minimal value of the distance between the probability distribution of the set of nodes and that of one of its possible partitions. When a set of nodes is chosen to compute effective information, it is referred to as a “mechanism”, and the partition to which it is compared is referred to as the “purview”.

The distance between probability distributions is computed by means of an adaptation of the Earth Mover’s Distance (EMD) algorithm, which is a method to evaluate dissimilarity between two multi-dimensional distributions in a given feature space, where a distance measure between single features, which we call the ground distance, is given. The EMD “lifts” this distance from individual features to full distributions. Note that the EMD is referred to as a Wasserstein metric in mathematics, and is commonly used in machine learning as a natural metric between two distributions.32 Intuitively, given two distributions, one can be seen as a mass of earth properly spread in space, the other as a collection of holes in that same space. Then, the EMD measures the least amount of work needed to fill the holes with earth. Here, a unit of work corresponds to transporting (by an optimal transport method) a unit of earth over a unit of ground distance.
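A one-dimensional instance of this earth-moving picture can be computed with SciPy's wasserstein_distance (a sketch for intuition only; IIT's toolbox implements its own multi-dimensional EMD adaptation):

    import numpy as np
    from scipy.stats import wasserstein_distance

    # Two distributions over the same four positions; the ground distance is |i - j|.
    positions = np.array([0, 1, 2, 3])
    earth = np.array([0.5, 0.5, 0.0, 0.0])   # mass of earth
    holes = np.array([0.0, 0.0, 0.5, 0.5])   # collection of holes

    # Least work to fill the holes: each half-unit of earth travels 2 positions.
    d = wasserstein_distance(positions, positions, earth, holes)
    print(d)   # 2.0 = 0.5 * 2 + 0.5 * 2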

5.3. Methods

In this section we introduce the meta-perturbation analysis, additional technical details of which are presented in the appendix. Next, we recap the causal perturbation and causal analysis leading up to the notion of program-size divergence, which is our core metric for how different programs, that is, systems — more or less integrated — respond to perturbations.


5.3.1. Programmability test and meta-test

In Ref. [29], a programmability test is introduced, inspired by the Turing test and based on the view that the universe and all physical systems in it that are able to process information can be considered (natural) computers29 equipped with particular computational capabilities.33 The programmability test is explained as "... replacing the question of whether a system is capable of digital computation with the question of whether a system can behave like a digital computer and whether a digital computer can exhibit the behavior of a natural system." Then, in the same way that the Turing test proceeds by asking questions of a computer in order to determine whether it is capable of intelligent behavior, the programmability test aims to discover what a specific system is capable of computing by means of algorithmic querying.34 In practice, the programmability test is a system perturbation test29,35 that "asks" questions of a computational system in the form: what is your output (answer) given this question (input)? This idea is applied in φK's implementation: once the set of all possible answers of a system is obtained, this set is analyzed and generalized to deduce rules that should not just offer a picture of the system's computational capabilities, but also simulate and give an account of the behavior of the system itself. A second step after this perturbation test is to analyze its results in order to construct a computer program — as simple as possible — capable not only of reproducing the output repertoire but also of giving an account of the programmability capabilities of the system itself, that is, rules capable of producing a certain output given an input, and at the same time explaining where, in ordinal terms, such an output is placed relative to the order of the full output repertoire. This latter aspect we refer to as the meta-perturbation test.


φK thus not only applies a perturbation test to a system, but also a meta-perturbation test to the results obtained in the first test. The rules found in this meta-test serve not only as compressed specifications or representations of the behavior itself, but also as rules that give an account of the behavior of the system. This can be done because the systems analyzed in IIT are well known; in other words, since all node-by-node operations are well defined, it is easy to compute all possible outputs (answers) for all possible inputs (questions or queries), corresponding to what in IIT are referred to as repertoires. In the context of φK, a meta-test is applied in order to find the rules that describe the behavior embedded in the repertoires of a system, instead of trying to ascertain the rules that define how the system works. A system specified in this manner turns into a "computer" that records its own behavior (e.g., the repertoires) as well as probing itself (for example, through the action of φK), in such a manner as to potentially give an account of its own behavior. To make this possible, a system specification must be equipped with an explanatory interface based on these simple behavior-embedding rules. φK goes beyond the original φ in that the programmability test searches for the rules underlying the behavior of a system rather than generating a description of its possible causal connections. While in IIT these rules are defined a priori and induced by perturbation, φK's objective is not only to find rules that simulate such behavior, but also to describe it briefly (thus simple rules) and to make predictions about the behavior of the system. The field of Algorithmic Information Dynamics9,11 implements this approach by asking what changes to hypothesized outputs mean for the hypothesized underlying program generating an observation, after an induced or natural perturbation. The simple rules discovered and used for the calculation of φK are used here exclusively to compose the constrained/unconstrained distributions used in IIT for obtaining cause-and-effect information, a key concept from which the integration of information derives. The rest of the calculus — the earth mover's distance measurements, the calculus of conceptual spaces, the major complex, and finding the MIP — remains as specified in IIT 3.0.


5.3.2. Causal perturbation analysis

From a statistical standpoint, it would be typical to suggest that the behaviors of two time series, let's call them X and Z, are potentially causally connected if they are statistically correlated. Yet, there are several other cases that would not be distinguishable after a correlation test. A first possibility is that the time series simply show similar behavior without being causally connected, that is, there is a shared upstream causal driver Y, concealed from the observer. Another possibility is that they are causally connected, but correlation does not tell us whether it is a case of X affecting Z, or vice versa. Perturbation analysis allows some disambiguation. The idea is to apply a perturbation to one time series and see how the perturbation spreads to the other. Perturbing the data point at position 5 of the time series Z, as shown in Figure 5.1, by multiplying it by −2, X does not respond to the perturbation: for this data point, X remains the same. This suggests that there is no causal influence of Z on X. In contrast, if the perturbation is applied to a value of X, Z changes and follows the direction of the new value, suggesting that the perturbation of X has a causal influence on Z. From behind the scenes, we can reveal that Z is the moving average of X, which means that each value of Z is calculated from two values of X, and so is a function of X. Even if we did not know that the two processes were related by this function, the results of these perturbations would produce evidence in favour of a causal relationship between them, suggesting that it is X which causally precedes Z. So we can say that this single perturbation suggests the causal relationship illustrated in Figure 5.2. There are a number of possible types of causal relationship between three events (see Figure 5.3) that can be represented in what is known as a directed acyclic graph (DAG), that is, a graph whose arrows imply a cause-and-effect relationship and which has no loops, because a loop would make a cause the cause of itself, or an effect its own cause, something that would be incommensurate with causality.


Figure 5.1. Causal intervention analysis on time series X and Z before and after a perturbation in Z (top) and in X (bottom). The values of Z come from the moving average of X, so there is a one-way causal relationship: perturbing X has an effect on Z, but perturbing Z has no effect on X, thereby suggesting the causal relationship.
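A minimal sketch of the intervention of Figure 5.1 (our own; Z is taken to be a two-point moving average of X, and the perturbed index and factor follow the example in the text):

x = Table[Sin[t], {t, 1., 20.}];
z = MovingAverage[x, 2];                   (* Z is computed from X *)
zPert = ReplacePart[z, 5 -> -2 z[[5]]];    (* perturbing Z leaves X untouched *)
xPert = ReplacePart[x, 5 -> -2 x[[5]]];    (* perturbing X instead ... *)
MovingAverage[xPert, 2] == z               (* ... propagates to Z: returns False *)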

Figure 5.2. Possible self-loopless causal relationship between two unlabelled variables X and Z.

Figure 5.3. Acyclic path graphs representing all possible self-loopless connected causal relationships among three unlabeled variables.


In these graphs, nodes are events, and events are linked to each other if there is a direct cause-and-effect relation. In the first case, labelled A in orange, the event X is the cause of event Y, and Y is the cause of event Z, so X is said to be an indirect cause of Z. In general, we are, of course, more interested in direct causes, because almost anything can be an indirect cause of anything else. In the second case, B, the event Y is a direct cause of both Z and X. Finally, in case C, the event Y has two causes, X and Z. With an interventionist calculus such as the one performed on the time series above, one may rule out some but not all cases; more importantly, the perturbation analysis offers the means to start constructing a model explaining the system and data rather than merely describing it in terms of simpler correlations. In our approach to integrated information, the idea is to identify the set of most likely generating candidates able to produce certain observed behavior, even if such behavior carries no statistical regularity and for all purposes appears statistically random.36 Strictly speaking, computational mechanics18,37 is a framework bridging statistical inference and stochastic modelling that suggests a model based on an automaton called an ε-machine. However, such machines are stochastic in nature, and if the methods used to reconstruct them rely on statistical methods, the result is only an apparent causal representation with no correspondence between internal states and the alleged states of the phenomenon observed. In contrast, approaches based on algorithmic probability, as pursued by algorithmic information dynamics, can complement computational mechanics, as they provide the means to construct non-stochastic automata models that are independent of probability distributions and are in a strict sense optimal and universal.38,39 In the case of our two time series experiments, the time series X is produced by the mathematical function f(x) = Sin(x), and thus Sin(x) is the generating mechanism of time series X. The generating mechanism of Z, on the other hand, is MovAvg(f(x)), and clearly MovAvg(f(x)) depends on f(x), which is Sin(x), while Sin(x) does not depend on MovAvg(f(x)). In the context of networks, the algorithmic-information dynamics of a network is the trajectory of the network moving in algorithmic-information space, together with the identification of those elements that shoot the network towards or away from randomness.


5.3.3. Causal influence and sublog program-size divergence

According to Algorithmic Information Dynamics,9,11 there is an algorithmic causal relationship between two states $s_t$ and $s_{t'}$ of systems $M$ and $M'$ if

$$|K(M_{s_t}) - K(M'_{s_{t'}})| \leq \log_2(t) + c.$$

That is, if the difference between the descriptions of such systems is bounded by a logarithmic term and a small constant $c$, then $M$ is most likely equal to $M'$ but in some other time state. In other words, if there is a causal influence of $s_t$ on $s_{t+1}$, or of $s_{t+1}$ on $s_t$, for a system in isolation, then the short descriptions of $M$ and $M'$ should not differ by more than the description of the difference. However, if the descriptions of the states of a system (or of two systems) at different alleged state times are not causally connected, their difference will diverge beyond the above bound. In integrated information, causal influence among the parts of a system is what is claimed to be measured; how well the different elements of a system can be explained by a single model, or by the other parts of the system, informs us as to how integrated the system may be. A system characterized by large divergence is less integrated than a system which evolves with small differences between the descriptions of its respective subparts. We suggest that perturbations have to be algorithmic in nature because they need to be made, or quantified, at the level of the generating mechanisms of the whole or of the different parts of the integrated system, and not at the level of the observations. For example, the n-ary expansions of the mathematical constant π given by Bailey–Borwein–Plouffe (BBP) formulas40 allow perturbations to individual digits that have no further effect, because no previous digits are needed to calculate any other segment of π in the same base. The constant π can then be said to be information-disintegrated to the extent of its BBP representations. Algorithmically low-complexity objects have low integrated information.


Similarly, highly random systems have low integrated information, because perturbations have little to no impact. Integrated information is, therefore, a measure of sophistication that separates high integration from both random and trivially non-random states.

5.3.4. A simplicity versus complexity test

With the previous section in mind we can proceed to introduce the idea of φK, where K stands for the letter commonly used for algorithmic (Kolmogorov or Kolmogorov–Chaitin) complexity, and φ for the traditional measure of integrated information theory.7 The measure φK mostly follows the methods that Oizumi and Tononi set forth in Ref. [7], where integrated information is measured, roughly speaking, as distances between the probability distributions that characterize a MIP (Minimum Information Partition), that is, "the partition of [a system] that makes the least difference".7 However, the difference between IIT's φ and φK lies in how φK circumvents what is called the "intrinsic information bottleneck principle",7 which traditionally requires an exhaustive search for the MIP among all possible partitions of a system, a procedure responsible for the fact that computing integrated information requires super-exponential computational resources. In contrast to φ, which follows a statistical approach to estimating and exhaustively reviewing repertoires, the approach to φK is based on principles of algorithmic information. Discovering the simple rules that govern a "discrete dynamical system"8 like those studied in IIT presupposes the analysis of its general behavior in pursuit of a dual agenda: first, to determine its computational capabilities, and second, to obtain explanations and descriptions of the behavior of the system. As a consequence, one of the major adaptations of IIT is that φK uses the concept of the Unconstrained Bit Probability Distribution (UBPD), that is, the individual probabilities associated with a node of a system taking the value 1 (ON) or 0 (OFF) after it has been "fed" all its possible inputs, or after all possible perturbations.



Figure 5.4. An example of using the UBPD to calculate an unconstrained output distribution.

Table 5.1. Computing the UBPD for the system shown in Figure 5.4.

Notes: Lines 1, 2: definition of the system in Figure 5.4 by adjacency matrix (line 1) and dynamics (line 2). Lines 3–5: calculation of the individual probabilities that each node of the system takes the value 0/1 across the whole output repertoire. Results box: computation time in seconds and the UBPD distribution.

In the context of φK, the UBPD is estimated by approximating the algorithmic complexity of the Transition Probability Matrix (TPM) in order to compute IIT's unconstrained/constrained probability distributions. Figure 5.4 and Table 5.1 explain the concept of the UBPD and its calculus, using the example employed by Oizumi et al.7 to calculate information integration. Figure 5.4(A) shows the network representation: three fully connected nodes with different types of operation executed on their inputs; for example, the inputs to node A (coming from nodes B and C) are processed by a logical OR operation.


Figure 5.4(B) shows the adjacency matrix that represents the same network. This adjacency matrix uses the number 1 to indicate that a node receives signals (inputs) from another node. For example, the first row in the adjacency matrix indicates that node 1, or A, receives inputs from nodes B and C, denoted as nodes 2 and 3. Finally, Figure 5.4(C) shows the full input and output repertoires, that is, for the full set of all possible inputs to this system, all corresponding outputs calculated according to the logical operations defined. Table 5.1 shows code for computing the UBPD for the system in Figure 5.4. This computation starts with the specification of the adjacency matrix (line 1) and the internal dynamics (line 2) of the target system; lines 1 and 2 in Table 5.1 thus represent code for the network specified in Figures 5.4(A) and 5.4(B). In the IIT approach, the system is perturbed with all possible inputs to obtain the full output repertoire (Figure 5.4(C)). In the context of φK, the UBPD then corresponds to the distribution of the probabilities that each node takes the value 0/1 in the output/input repertoires after the perturbation. Now, let's say we want to compute the future probability distribution, that is, the probability needed to compute effect information according to Ref. [7]. In this case we take the output repertoire as a reference and compute the probability of the nodes in the future (the outputs) taking the values 0 or 1. For node A, for example, the probability that it takes the value 1 is 0.75, that is, 6/8, and the probability that it takes the value 0 is 0.25, or 2/8. These values are called the UBPD for node A. A summary of the UBPD for all nodes is given in Figure 5.4(B). Once the UBPD is computed for a subject partition, in this case the full system, its probability distribution is computed by multiplying the UBPDs. Take as an example the future probability of the state {0, 0, 0}, computed as $P(A=0) \cdot P(B=0) \cdot P(C=0)$, that is, $0.25 \times 0.75 \times 0.5$ (see the first row in Figure 5.4(C)). When all future probabilities are computed in this manner, the result is the distribution shown in Figure 5.4(D), which is exactly the one computed in Ref. [7], as shown in Figure 5.4(E).


In general, the UBPD is used to compute the probability distributions of a system in the context of φK, mirroring the "constrained/unconstrained probability distributions" of Ref. [7], that is, probability distributions of input/output patterns for specific configurations (partitions) of the system, in contrast to how IIT 3.0 proceeds. In this last case, Mayner shows how probability distributions are computed in the context of IIT in the S1 text mentioned in Ref. [8], using devices such as "marginalization" and "virtual elements" that are rather involved. In the context of φK, the UBPD aims to obtain the same results in terms of probability distributions, in a manner equivalent to IIT but following a different conceptual approach. Our measure φK uses adapted methods, with algorithmic complexity as a background, to compute information integration. In Table 5.1, lines 3–5 show Mathematica code that computes the UBPD for the system specified in lines 1 and 2, that is, by means of an adjacency matrix and an array of the computations that the nodes perform, i.e., the system dynamics. Table 5.1 also shows the results of this computation, in this order: the time needed to compute, followed by the probability that a node takes the value zero (zeroProb) or the value one (oneProb). One can see how the results in Table 5.1 correspond to the UBPD values shown in Figure 5.4(B). We should note that for φK, the computation of probability distributions might seem to be a counting task, which for huge systems would be extremely difficult or even impossible if attempted in a classical, brute-force way. But two important facts should be pointed out here: (1) in the context of φK, the UBPD is not calculated in this traditional way, but by simulating the behavior of a system represented by a set of simple rules, so that an exhaustive review of repertoires is not needed to compute the individual probabilities shown in Table 5.1; and (2) despite strong theoretical and methodological differences, φK and φ lead to the same results. In the next sections, we derive the simple rules of a system using the perturbation test, and show how they are applied to implement φK.
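To make the calculation concrete, here is a minimal Wolfram Language sketch (our own) of the UBPD for a three-node system; node A's OR gate is stated above, while the AND and XOR gates of nodes B and C are assumed from the example of Ref. [7]:

gates = {Or, And, Xor};                     (* node update rules for A, B, C; B and C assumed *)
inputsOf = {{2, 3}, {1, 3}, {1, 2}};        (* adjacency: nodes feeding each node *)
step[s_List] := Boole @ MapThread[
   Function[{g, in}, g @@ (s[[in]] /. {1 -> True, 0 -> False})],
   {gates, inputsOf}];
repertoire = step /@ Tuples[{0, 1}, 3];     (* full output repertoire, 8 rows *)
ubpd = N[Mean[repertoire]]                  (* P(node = 1) per node: {0.75, 0.25, 0.5} *)
Times @@ (1 - ubpd)                         (* unconstrained P of future {0,0,0}: 0.25*0.75*0.5 *)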


5.4. Numerical Results

5.4.1. Compression sensitivity as informative of integration

To understand the relationship between IIT and algorithmic complexity, we shall briefly move away from the case of networks and focus on binary files and the binary programs that may generate them: programs that are natural computable candidate models explaining the data. To illustrate the connection, let us take some extreme cases. Let's say we have a random file:
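For instance (our stand-in for the original listing, using a 100-character pseudo-random string):

SeedRandom[1];
file = StringJoin[RandomChoice[CharacterRange["a", "z"], 100]];
{StringLength[file], StringLength[Compress[file]]}   (* the compressed string comes out longer *)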

So, using the Compress algorithm, the resulting compressed object is even longer. This is because compression algorithms insert the decompression instructions together with a checksum, which ends up increasing the size of the resulting object if the object was not long and compressible enough to begin with. This is what happens if we perform a couple of random perturbations on the uncompressed file (the positions below are chosen arbitrarily):
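pert = StringReplacePart[file, {"#", "#"}, {{10, 10}, {60, 60}}];   (* two single-character edits *)
StringLength[Compress[pert]]   (* roughly the same size as before *)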


The difference between the original and perturbed files is minimal.

The files only differ by 2 characters, which can be counted, for instance, with code along the following lines:
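HammingDistance[file, pert]   (* number of differing positions: 2 *)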

That is, 2/100, or 2%. On the other hand, let's take a simple object consisting of the repetition of a single character, say the letter e:
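simple = StringJoin[Table["e", 500]];   (* 500 repetitions of "e", matching the perturbed program below *)
StringLength[Compress[simple]] < StringLength[simple]   (* True: highly compressible *)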

A shortest program to generate such a file is just:
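Table["e", 500]   (* cf. the perturbed version Table["a", 500] discussed below *)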

In other languages this could be produced by an equivalent "For" or "Do-While" program. We can now perturb the program again, without loss of generality. Let's allow the same two perturbations, but to the data only, and not to the program instructions (we will cover that case later). The only places that can be modified are thus the "e" or a digit of 500 (e.g., a 1 instead of the 5); say: Table["a", 500].

Now, the outputs of the original and perturbed programs differ in 500 elements, and not just in a small fraction (relative to the total program length) as in the random case. This will happen in the general case with random and simple files; random perturbations have a very different effect in each case.
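The contrast can be checked directly. A rough sketch (our own; the absolute numbers depend on the compressor, and only the relative effect matters):

kC[s_String] := StringLength[Compress[s]];            (* compressed length as a crude stand-in for K *)
sens[s_String] := N @ Mean @ Table[
    Abs[kC[StringReplacePart[s, "#", {i, i}]] - kC[s]],
    {i, StringLength[s]}];
sens[StringJoin[Table["e", 500]]]                               (* simple file: relatively large shifts *)
sens[StringJoin[RandomChoice[CharacterRange["a", "z"], 500]]]   (* random file: barely moves *)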


An object that is highly integrated among its parts is one in which part of each part can be explained or described by some other part. When the object is algorithmically simple, these parts can be compressed by exploiting the information that other parts carry over from yet others, and the resulting program will be highly integrated only if the removal of any of these parts has a nonlinear effect on its generating program. In a random system, no part contains any information about any other, and the individual algorithmic-content contributions of the elements are normally distributed around their mean; such a system is poorly integrated and trivial. So integrated information is a measure of sophistication, filtering out simple and random systems, and ascribing high algorithmic information content only to highly integrated information systems. The algorithmic information calculus thus consists of a 2-step procedure to determine:

1. The complexity of the object (e.g., string, file, image).
2. The elements in that object that are less, more, or not sensitive to perturbations that can "causally steer the system," that is, causally modify the object in a surgical algorithmic fashion rather than on the basis of guesswork based on statistics.

Note that this causal calculus is semi-computable, and one can perform guiding perturbations based upon approximations.9,11 Also note that we did not cover the case in which the actual instructions of the program are perturbed. This is in fact just a subcase of the previous case, which separates data from program: for any program and data, we can conceive an equivalent Turing machine with empty input, thus effectively embedding the data as part of the instructions. Nevertheless, the chances of modifying the instruction Print[] in the random file case are constant, and for the specific example are 7/107 = 0.0654, while for the non-random case, the probability of modifying some piece of the Table[] function is 8/12 = 0.666667. Thus, the break-up of the program of a highly causally generated system is more likely under random perturbations.


Notice the similarity to a checksum used, for example, for file exchange verification (e.g., against corruption or virus infection when downloading from the Internet), where the data to be transmitted is a program and the data block to check is the program's output file (which acts as a hash). Unlike regular checksums, the data block to check is longer than the program, and the checking is not for cryptographic purposes. Moreover, the dissimilarity distance between the original block (the shared information) and the output of the actually shared program provides a measure both of how much the program was perturbed and of the random or non-random nature of the data encoded by the program. And just as with checksums, one cannot tamper with the program without perturbing the block to be verified (its output), that is, without significantly changing the output (block), if what the program encodes is non-random and therefore causally/recursively/algorithmically generated. Of course, the theory is defined in terms of binary objects, but for purposes of illustration, and with no loss of generality, we have shown actual programs operating on larger alphabets (ASCII). We also decided to perform perturbations on what appears to be the program's data and not the program itself (though we have seen that this distinction is not essential) for illustration purposes, to avoid the worst case, in which the computer program becomes non-functional. Indeed, this means that the algorithmic calculus is actually more relevant, because it can tell us which elements in the program break it completely and which do not. But what happens when changes are made to the program output and not the program instructions? Say we exchange an arbitrary e for an a in our simple sequence consisting of a single letter, for example, the third entry ("a" for "e"). If we were to look for the generating program of the perturbed sequence, it would need to account for the a, for example,
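StringJoin[Table["e", 2], "a", Table["e", 497]]   (* one possible such program accounting for the "a" in position 3 *)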


where the second program is longer than the original one, and has to be if the sequence is simple; but the program remains unchanged if the file is random, because the shortest program of a random sequence is the random sequence itself, and random perturbations keep the sequence random. Furthermore, every element in the simple example consisting of repetitions of e has exactly the same algorithmic-content contribution when changed or removed, as all programs after perturbation are of the form:
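perturbedProgram[k_] := StringJoin[Table["e", k - 1], "a", Table["e", 500 - k]]   (* schematic, our own parametrization: every position k yields a program of the same length *)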

Note also how this is related to φ, and possibly to any measure of integrated information based on the same principles. We can now apply all these ideas to the language of networks, with respect to which IIT has, for the most part, been defined. We have shown before that networks with different topologies have different algorithmic complexity values,41 in accordance with the theoretical expectation. Random ER graphs, for example, display the highest values, while highly regular and recursive graphs display the lowest.42 More probabilistic, but still recursively generated, graphs are located between these two extremes.43 Indeed, the algorithmic complexity K of a regular graph grows by O(log N), where N is the number of nodes of the graph, as in a highly compressible complete graph. In a truly random ER graph, by contrast, K grows with the number of edges E, because the information about the location of every edge has to be specified. In what follows we perform some numerical tests strengthening our analytic derivations.

5.4.2. Finding simple rules in complex behavior

A perturbation test is applied to the systems in which IIT is interested. The set of answers is analyzed in order to find the rules that (1) make it possible to simulate the behavior of the system, (2) define its computational power, that is, rules that give an account of what the system can and cannot compute, and (3) are able to describe and predict the behavior of the same system.


The following procedure was applied to estimate φK.

1. The perturbation test was applied to the systems used in IIT to obtain the detailed behavior of those systems.
2. The results of step 1 were analyzed in order to reduce the dynamics of a system to a set of simple rules. That is, in keeping with the claims of natural computation, we found simple rules to describe a system's behavior.
3. The rules found in step 2 were used to generate descriptions of what a system is or is not capable of computing, and under what initial conditions, without having to calculate the whole output repertoire.
4. A combination of the rules found in steps 2 and 3 was used to develop procedures for predicting the behavior of a system, that is, for finding reduced forms that express complex behavior. Knowing what conditions are necessary for the system to compute something, it is possible to pinpoint where, in the whole map of all possible inputs (questions) to a system, such conditions may be found.
5. Once the rules of steps 2 and 3 were formalized, φK was turned into a kind of interrogator whose purpose is to ask a system questions about its own computational capabilities and behavior.

This kind of analysis allowed us to find that the distribution of information in the complex behavior of the systems analyzed in IIT follows a distribution replicated at several scales, of the kind usually and informally described as "nested" or "fractal", which means that it is susceptible of being summarized in simple rules by iteration or recursion, just as is the case with fractals proper. These properties are used to find compressed forms of the answers given by a system when asked for explanations of its own behavior. This means that, as noted before, φK does not compute the whole output repertoire of a system but uses simple rules to express the whole behavior of the system. Interestingly, the way in which we proceed appears to be connected to whether or not the system itself can explain its behavior, or rather whether it can see itself as capable of producing its behavior from an internal experience (configuration), which is then evaluated by an observer.


So φK takes the form of an automatic interrogator that, in imitation of the perturbation test, asks questions of the form: are you capable of this specific configuration (pattern)? And if so, say where, in the map of the behavioral repertoire, I can find it. The benefit of representing systems using simple rules is that it allows an alternative calculation closer to algorithmic complexity, with the potential to reduce the number of calculations needed to derive an educated estimation, as compared to the original version of IIT 3.0. At this point, it is not possible to explain how simple rules define a system in the context of φK without talking about the pattern of distribution of information in the behavior of systems like those studied in IIT.

5.4.3. Simple rules and the fractal pattern of distribution of information

As shown in Ref. [34], even when a system derives from a very simple program, a partial view and a limited number of observations, without knowledge of the source code, can be misleading or uninformative, illustrating how difficult it can be to reverse-engineer a system from its evolution in order to discover its generating code.34 In the context of IIT, when we talk about a complex network we find that there are different levels at which complex phenomena can be understood, such as knowing the rules implemented by each node in a system, or finding the rules that describe its behavior over time. To achieve the second, as perhaps could be done for the "whole [of] scientific practice",34 we found it useful to perform perturbation tests in order to deduce the behavior of the subject systems. The results were analyzed, and a pattern in the distribution of information was found to characterize the behavior of these kinds of systems. Then, as was to be expected, the replicating behaviors were amenable to being expressed with simple formulae.


Figure 5.5. Seven-node system: adjacency matrix and network representation.

In order to explain how simple rules were found and implemented in φK, consider as an example the 7-node system shown in Figure 5.5, whose behavior is computed by perturbing the system with all possible inputs. The results, i.e., the whole output repertoire, are shown in Table A.1 in Appendix A. The strategy adopted to find the rules that govern a system's behavior is the same one used in almost any branch of science, which is to say that we separately observe the behavior of some of the components of a phenomenon, in this case nodes, while bearing in mind that this behavior is not isolated but rather the by-product of interacting elements; in other words, we observe individual behaviors without losing sight of the whole. When we observe the whole behavior of the system shown in Figure 5.5 (see Appendix A, Table A.1), we notice mostly chaotic behavior, but with subtle repetitions of certain patterns. When the behavior of individual elements is isolated, the picture becomes clearer. For example, Table 5.2 shows the isolated behavior of nodes {4} and {5} of the same subject system, where it is possible to observe that the isolated behaviors of {4} and {5} follow a sort of order. Such patterns are summarized in what we call behavior tables, shown in Figure 5.6. The lowest rows of the behavior tables in Figure 5.6 (within braces) correspond to a compressed representation of the behaviors shown in Table 5.2.


Figure 5.6. Behavior tables for the seven-node system in Figure 5.5: (A) node 4, (B) node 5. From left to right: the Node column lists the input nodes that feed the target node; the node − 1 = power column computes the power used to transform a pattern in the world of 7-node systems from binary to decimal; the 2^pow column is the result of the binary-to-decimal transformation; the fourth column contains the quotients of the elements indexed n + 1 in column 3 by the elements indexed n.

Table 5.2. Isolated outputs for nodes {4} and {5} of the system introduced in Figure 5.5, after perturbation.

The compressed expression of the behavior of node {4}, for instance, means: 85 repetitions of the digit 0, followed by a pattern repeated two times, this pattern being one 1 followed by one 0 (that is, {1->1, 1->0}). This last pattern is followed by four 0 digits, and so on.
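The compressed representations just described are essentially run-length encodings; a minimal sketch (our own):

runs[seq_List] := {First[#], Length[#]} & /@ Split[seq]
runs[{0, 0, 0, 1, 0, 1, 1, 1}]   (* {{0, 3}, {1, 1}, {0, 1}, {1, 3}} *)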


Figure 5.7. Network example with nine nodes.

Here, note that the first number, 85, is the sum of the numbers shown in the third column of its behavior table. For node 5, the compressed representation of the behavior means: four repetitions of the digit 0, followed by four digits 1; this pattern is followed by eight repetitions of the digit 1, and the last pattern is followed by 96 (32 + 64) repetitions of the digit 1. The representation used in this isolation of behaviors is expressed in terms of the nodes that "feed" the target nodes of this example (the Node column), namely nodes {4, 5}, whose inputs, according to Figures 5.5 and 5.6, are {1,3,5,7} for node {4} and {3,4,6,7} for node {5}. This first shallow analysis serves to build the intuition that the behavior of an isolated node can be expressed as a series of regularities in terms of its inputs. In this context, intuition tells us that the greater the number of regularities, the shorter the description; if no patterns are detected, the chances of a causal relationship are lower. The perspective changes when the rule/algorithm, or compressed expression of behavior, is constructed not from regularities identified at first sight, but from intrinsic algorithmic properties. In this latter case, the behavior of systems can be expressed as patterns of information with a distribution replicable at different scales, what we here call fractal representation or fractal behavior. To explain what we mean by fractal, we introduce the characteristics of the distribution of information for the 7-node system shown in Figure 5.5, analyzed using φK. This implementation is shown in Table 5.3.


Table 5.3. φK asking for an account of the distribution of information in the behavior of the 4th node of the system shown in Figure 5.5. Lines 1–8: definition of the 7-node system by means of its adjacency matrix and internal dynamics. Line 10: φK's code asking for the locations of the digit zero in the whole behavior of node 4. Line 12: compression of the answer given by the system in line 10. Lines 1 and 2 of the results box: compressed form of the distribution of the digit 0 in the behavior of node 4. The second grey box shows the unfolded answer of the system.

Table 5.3 shows how the behavior of the system in Figure 5.5 can be expressed as simple rules, following an analysis based on a querying scheme that results in a reduced form expressing its information distribution as a pattern replicated at different scales, i.e., as a fractal form. The answers given by systems join the facts explored above on regularities and the fractal distribution of information.


It is important to note that the querying scheme has to be computable and algorithmically random in order to avoid introducing artificially random-looking behavior from the observer (experimenter/interrogator) into the observed (the system in question). In Table 5.3, after the target system is defined by means of an adjacency matrix and a dynamics vector (lines 1 to 8), φK can be regarded as testing how 0 is distributed in node 4 of the seven-node system (line 10). The target system reacts to φK's query and "answers" in a compressed form (Table 5.3, second part, lines 1 and 2). The result can be represented in compressed form, expressed as a tiny rule that represents what we have called a fractal pattern. Such an expression is defined, as can be seen in the second box in Table 5.3, by two variables: DecimalRepertoire, which holds points spaced in different proportions at which the patterns defined by the Sumandos variable must be reproduced. This means that in order to unfold the whole distribution (of the digit zero), the pattern of numbers in Sumandos must be added to each value in DecimalRepertoire. Once this 'fractal' simple rule is unfolded, we obtain the ordinal places at which, in the whole behavior of node 4, the digit 0 can be found (see the third box in Table 5.3). The accuracy of this answer can be verified by counting the ordinals at which the output of node 4 equals 0 in Table A.1 in Appendix A, bearing in mind that counting starts at 0. In summary, φK is turned into a kind of interrogator that asks a system about its own behavior, while the system is implemented as a set of rules that answers in different ways, depending on the information requested. This is unlike traditional approaches to φ, whose representation of the system consists of the system's whole output repertoire, which may be an important disadvantage when large networks are analyzed. φK's answers use compressed forms that take advantage of the fractal distribution of the information in the behavior of the system, for which the answering interface is a function of its input related to each node in question.
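A minimal sketch of the unfolding step just described (our own; the values below are hypothetical placeholders, not the actual output of Table 5.3):

decimalRepertoire = {0, 32, 64};   (* hypothetical anchor positions *)
sumandos = {0, 1, 4, 5};           (* hypothetical repeated offsets *)
unfold[anchors_List, offsets_List] := Sort @ Flatten @ Outer[Plus, anchors, offsets]
unfold[decimalRepertoire, sumandos]   (* ordinal positions at which the digit 0 occurs *)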


Obviously, the whole behavior of a system is not a matter of isolated elements, but of elements interacting in a nonlinear manner, as IIT 3.0 makes clear. This broader view is also addressed in terms of φK, and is explained in the following sections. In the next one, the advantages of simple rules over classical/naive approaches based on an exhaustive calculus and review of whole repertoires held in memory will be established.

5.4.3.1. Automatic meta-perturbation test

It can be seen that this querying system is similar to the programmability tests suggested in Refs. [9, 29, 35], based on questions designed to ascertain the programmability of a dynamical system. The last section showed that implementing systems as simple rules that give rise to complex behavior enables the system itself to "respond" to questions about where, in the chain of digits that make up its behavior (for a specific node), a certain pattern is to be found. And the fractal nature of the distribution of information in the behavior allows us to answer complex distribution questions in short forms. In this section, we show the advantages of using an (automatic) perturbation test based on the simulation of behavior using simple rules, over the original version of IIT 3.0 based on the "bottleneck principle",7 in computing integrated information. Following the original perturbation test, questions take the form: what is the output (answer) given this query (input)? But in φK, since questions look for explanations of the behavior of the system itself, they take the form: tell me whether this pattern is reachable, and if so, tell me where, in the behavioral map, it is possible to find it. An example of how a φK implementation turns into an automatic interrogator is shown in Table 5.4, which analyzes the network shown in Figure 5.7. In line 2 of the code listed in Table 5.4, it is possible to see how φK asks questions of a system. This line should be interpreted as: Can you compute the pattern {8,9} = {1,1} when {8,9} -> {"OR", "AND"}? If yes, tell me under what conditions you can do so. In this example, φK first tries to find the conditions needed to compute a specific output. As Table 5.4 shows, the answer is: Yes! I can. This may happen when {1,3,5,6,7} = {{1,1,1,0,1},{1,1,1,1,1}}.


Table 5.4. φK algorithmically querying the system of Figure 5.7 about its own behavior. Line 2: Query: Is it possible for this system to compute {8,9} = {1,1} when {8,9} -> {"OR", "AND"}, their input nodes being {1,3,5,6} and {1,5,7,3}, respectively? The results show that the system does compute it when {1,3,5,6,7} = {{1,1,1,0,1},{1,1,1,1,1}}.

In this answer, {1,3,5,6,7} is the set of inputs to the subsystem {8,9}. The reader will note here that the answers offered by the system are in fact the conditions, or inputs, the system needs in order to compute a specific output, in a format equivalent to Holland's schemas. The equivalent schema form for this case would be {{1,*,1,*,1,0,1,*,*}, {1,*,1,*,1,1,1,*,*}}, where '*' is a wildcard standing for 0/1 (any symbol). Such schemas correspond exactly to the generalized answer offered by the system, that is, {1,3,5,6,7} = {{1,1,1,0,1},{1,1,1,1,1}}. This answer, like Holland's schema theorem,44 works by imitating genetics, where a set of genes is responsible for specific features of phenotypes. What φK retrieves is the general information that yields specific inputs for the current system. Probably the greatest advantage of the φK approach to querying samples has to do with the computation time needed to retrieve such information as compared with a traditional (brute-force) approach: 1/10 in this case (for the results of the brute-force calculation see Table A.2).
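Schema matching of this kind is straightforward to state; a minimal sketch (our own, with "*" as the wildcard):

schemaMatchQ[schema_List, state_List] :=
  And @@ MapThread[#1 === "*" || #1 === #2 &, {schema, state}]
schemaMatchQ[{1, "*", 1, "*", 1, 0, 1, "*", "*"}, {1, 0, 1, 1, 1, 0, 1, 0, 1}]   (* True *)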


This suffices as evidence that the compression and generalization of systems in the form of simple rules, based on a naturally fractal distribution of information, has advantages over common-sense or classical approaches to the analysis of complex systems, particularly in terms of the computational resources needed to compute integrated information. All of the above was applied to the analysis of isolated or very simple cases. In the next section the generalization to n nodes of a system is addressed, and how this serves to compute integrated information according to IIT.

5.4.3.2. Shrinking after dividing to rule

In the previous sections it was shown how φK, applying a perturbation test, can deduce, firstly, what a system is capable of computing and the conditions under which a computation can be performed, and secondly, that by means of the simple rules specifying a system it is possible to obtain descriptions of its behavior in the form of rules that say how information is distributed, or in other words, where, in ordinal terms, such conditions can be found. The ultimate objective of obtaining this kind of description of the behavior of a system is to know how many times specific patterns appear in whole repertoires, and thus to construct probability distributions without needing exorbitant computational resources, since these probability distributions are a key ingredient used by IIT to compute integrated information. φK addresses this challenge using a two-pronged strategy, consisting firstly of parallelizing the analytical process — which is no more than a technical device available in almost any computer language, and falls beyond the scope of this paper — and secondly of partitioning the target sets. This latter part of φK's strategy consists of two steps: (1) given a target set to be analyzed, dividing it into parts to be interrogated by φK via an automatic test, and (2) finding the MIP, the Minimum Information Partition, using the algorithm proposed and proved in Ref. [45]. In the context of φK, when a partition of a subject system has been analyzed, the search space for the remaining parts is significantly reduced, facilitating and accelerating the analysis of those remaining parts.


Table 5.5. Comparison of processing time when a system is divided to compute outputs. Lines 10–14: φK asking the system defined in lines 1–9 for patterns of zeros of different lengths (3, 4 and 7) and combinations. Lines 1–5 of the results box show the time in seconds taken by the computations, and the answers in terms of indexes, using the compressed notation. It can be observed that the longer the pattern sought, the greater the amount of time required to perform the computation, while the time ratio decreases.
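The queries in lines 10–14 amount to locating progressively longer runs of zeros; a minimal stand-in (our own, assuming a node's behavior is given as a 0/1 list):

zeroRuns[behavior_List, k_Integer] := SequencePosition[behavior, ConstantArray[0, k]]
zeroRuns[{0, 0, 0, 1, 0, 0, 0, 0}, 3]   (* {{1, 3}, {5, 7}, {6, 8}}; overlapping runs included *)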

To illustrate this idea, take for example Table 5.5, which shows the definition of a system of 7 nodes (lines 1–9) in which a pattern of progressively growing length is searched for (lines 10–14). In this example φK repeatedly asks the system whether it is capable of producing a growing pattern of zeros; if it is, the system is requested to show where the desired pattern can be found. Obviously, larger patterns need more computation, but as can be seen in the results box of Table 5.5, the time used by φK increases as the pattern's length increases (Table 5.5, results box, lines 1–5), but it grows linearly, in contrast to IIT 3.0, where it grows exponentially.


5.4.4. Limitations

When we visualize the behavior of a system (or a subsystem such as an isolated node) and take into account its implementation, then from the point of view of optimizing computational resources, running rules to generate the behavior of the whole is still a challenge, because it is an expensive process in terms of time and memory. Hence, for large systems, analyses based on exhaustive reviews of such behavior can eventually become intractable. In order to overcome this limitation, φK attempts to find rules that not only give an account of the computational capabilities of a system, but also describe its behavior. In other words, we wanted to explore the possibility of finding "shortcuts to express the behavior" of a whole system. One other obvious limitation, inherited from computability and algorithmic complexity, is the semi-computability of the process of trying to find simple representations of behavior. However, we are not required to find the shortest (simplest) representation, but simply a set of possible short (simple) ones, which is already an indication of the kind of system we are dealing with. While one can find shorter descriptions using popular lossless compression algorithms, the more powerful the algorithms for finding shortcuts and fractal descriptions, the faster the computation and the more telling the results, something to be expected given the relationship between the way integrated information is estimated, on the one hand, and algorithmic complexity, on the other.

5.5. Conclusions and Future Directions

Here, we have sketched connections, provided pseudo-algorithms (Appendix B), and developed first approaches towards a calculus of φ metrics based on algorithmic perturbation analysis, which in turn has a solid mathematical foundation that must be studied further. Our computational approach targeted what is referred to as IIT 3.0, defined as a calculus of probability distributions.


Instead of considering distances between statistical distributions, we formulated the problem as a distance in an algorithmic complexity space, properly approximated, in response to perturbations of the system, and introduced a meta-test whose answers may provide guidance on the algorithmic complexity and integrated information of the system. More exploration of the theoretical and practical connections between these theories is still needed. Interestingly, such a perturbation programmability test — initially inspired by the Turing test (establishing another interesting connection between these new theories of consciousness and past ones) — as applied to physical systems, is a working strategy for finding explanations of the behavior of systems. It remains for future work to make conceptual and computational connections to what Oizumi, Tononi et al. called the Minimum Information Partition (MIP)7 of a system. Having this first version of φK, we conjecture that MIP definitions also obey, and are connected to, algorithmic complexity in much the same way, as they should remain based on rules of an algorithmic nature. Thus, the next step is to go further in the application of the test introduced in this paper to discover simple rules that would help to find the MIP in a more natural and faster way. Another possible direction is to systematize the finding of these simple rules and apply more powerful methods to enable the computation of larger systems. Here, however, we have merely established first principles and the directions that can be explored following these ideas. Finally, we think that these ideas about self-explanatory systems capable of providing answers to questions about their own behavior can help in devising techniques to make other methods, in areas such as machine and deep learning, explain their own, often obscure, behavior.

References

1. S. H. Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering (Studies in Nonlinearity) (CRC Press, 2000).


2. P. Dayan and L. F. Abbott, Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems (MIT Press, 2001).
3. D. Chandler, Introduction to Modern Statistical Mechanics (Oxford University Press, 1987).
4. A.-L. Barabási and M. Pósfai, Network Science (Cambridge University Press, 2016).
5. M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, Geometric deep learning: Going beyond Euclidean data. IEEE Signal Process. Mag. 34(4), 18–42 (2017).
6. P. W. Anderson, More is different. Science 177(4047), 393–396 (1972).
7. M. Oizumi, L. Albantakis, and G. Tononi, From the phenomenology to the mechanisms of consciousness: Integrated information theory 3.0. PLoS Comput. Biol. 10(5), e1003588 (2014).
8. W. G. P. Mayner, W. Marshall, L. Albantakis, G. Findlay, R. Marchman, and G. Tononi, PyPhi: A toolbox for integrated information theory. arXiv:1712.09644 [cs, q-bio] (2017).
9. F. Marabita, Y. Deng, S. Elias, A. Schmidt, G. Ball, J. Tegnér, H. Zenil, and N. A. Kiani, An algorithmic information calculus for causal discovery and reprogramming systems. iScience, S2589-0042(19)30270-6 (2019).
10. H. Zenil, N. A. Kiani, and J. Tegnér, Algorithmic information dynamics of emergent, persistent, and colliding particles in the game of life. In A. Adamatzky (ed.), From Parallel to Emergent Computing (Taylor & Francis/CRC Press, 2019), pp. 367–383.
11. H. Zenil, N. A. Kiani, and J. Tegnér, Algorithmic Information Dynamics: A Computational Approach to Causality in Application to Living Systems (Cambridge University Press, 2020).
12. M. Rosanova, M. Boly, S. Sarasso, K. R. Casali, S. Casarotto, M.-A. Bruno, S. Laureys, G. Tononi, A. G. Casali, O. Gosseries, and M. Massimini, A theoretically based index of consciousness independent of sensory processing and behavior. Sci. Transl. Med. 5(198), 198ra105 (2013).
13. A. N. Kolmogorov, Three approaches to the quantitative definition of information. Prob. Inform. Trans. 1(1), 1–7 (1965).
14. G. J. Chaitin, On the length of programs for computing finite binary sequences. J. ACM 13(4), 547–569 (1966).
15. J.-P. Delahaye and H. Zenil, Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Appl. Math. Comput. 219(1), 63–77 (2012).
16. F. Soler-Toscano, H. Zenil, J.-P. Delahaye, and N. Gauvrit, Calculating Kolmogorov complexity from the output frequency distributions of small Turing machines. PLoS ONE 9(5), 1–18 (2014).
17. H. Zenil, S. Hernández-Orozco, N. A. Kiani, F. Soler-Toscano, A. Rueda-Toicen, and J. Tegnér, A decomposition method for global evaluation of Shannon entropy and local estimations of algorithmic complexity. Entropy 20(8), 605 (2018).


18. C. R. Shalizi and J. P. Crutchfield, Computational mechanics: Pattern and prediction, structure and simplicity. J. Statist. Phys. 104(1), 817–879 (2001).
19. H. Zenil, N. A. Kiani, and J. Tegnér, Low-algorithmic-complexity entropy-deceiving graphs. Phys. Rev. E 96, 012308 (2017).
20. H. Zenil, Algorithmic data analytics, small data matters and correlation versus causation. In M. Ott, W. Pietsch, and J. Wernecke (eds.), Berechenbarkeit der Welt? Philosophie und Wissenschaft im Zeitalter von Big Data (Computability of the World? Philosophy and Science in the Age of Big Data) (Springer Verlag, 2017), pp. 453–475.
21. H. Zenil, L. Badillo, S. Hernández-Orozco, and F. Hernández-Quiroz, Coding-theorem like behaviour and emergence of the universal distribution from resource-bounded algorithmic probability. Int. J. Parallel Emergent Distrib. Syst. (2018).
22. T. Schreiber, Measuring information transfer. Phys. Rev. Lett. 85(2), 461–464 (2000).
23. C. W. J. Granger, Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37, 424–438 (1969).
24. P. L. Williams and R. D. Beer, Nonnegative decomposition of multivariate information (2010), arXiv:1004.2515.
25. P. L. Williams and R. D. Beer, Generalized measures of information transfer (2011), arXiv:1102.1507.
26. V. Iapascurta, Detection of movement towards randomness by applying the block decomposition method to a simple model of the circulatory system. Complex Syst. 28(1) (2019).
27. M. Ventresca, Using algorithmic complexity to differentiate cognitive states in fMRI. In L. Aiello, C. Cherifi, H. Cherifi, R. Lambiotte, P. Liò, and L. Rocha (eds.), Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018, Studies in Computational Intelligence, vol. 813 (Springer, 2018).
28. G. Ruffini, An algorithmic information theory of consciousness. Neurosci. Conscious. 1, nix019 (2017).
29. H. Zenil, A behavioural foundation for natural computing and a programmability test. In G. Dodig-Crnkovic and R. Giovagnoli (eds.), Computing Nature: Turing Centenary Perspective, vol. 7 (Springer SAPERE, 2013), pp. 87–113.
30. J. Pearl, Causality: Models, Reasoning, and Inference (Cambridge University Press, Cambridge, UK, 2000).
31. G. Tononi, M. Boly, M. Massimini, and C. Koch, Integrated information theory: From consciousness to its physical substrate. Nat. Rev. Neurosci. 17(7), 450–461 (2016).
32. C. Villani, Topics in Optimal Transportation (American Mathematical Society, Providence, RI, 2003).
33. H. Zenil, A behavioural foundation for natural computing and a programmability test. In Computing Nature (Springer, 2013), pp. 87–113.


34. H. Zenil, A. Schmidt, and J. Tegnér, Causality, information and biological computation: An algorithmic software approach to life, disease and the immune system. In S. I. Walker, P. C. W. Davies, and G. Ellis (eds.), Information and Causality: From Matter to Life (Cambridge University Press, 2017), pp. 244–279.
35. H. Zenil, Testing biological models for non-linear sensitivity with a programmability test. In P. Liò, O. Miglino, G. Nicosia, S. Nolfi, and M. Pavone (eds.), Advances in Artificial Life (MIT Press, 2013), pp. 1222–1223.
36. S. Devine, The application of algorithmic information theory to noisy patterned strings. Complexity 12(2), 52–58 (2006).
37. C. R. Shalizi and K. L. Shalizi, Optimal nonlinear prediction of random fields on networks. Discrete Math. Theor. Comput. Sci. AB(DMCS), 11–30 (2003).
38. R. J. Solomonoff, A formal theory of inductive inference. Part I. Inform. Control 7(1), 1–22 (1964).
39. R. J. Solomonoff, A formal theory of inductive inference. Part II. Inform. Control 7(2), 224–254 (1964).
40. D. H. Bailey, P. B. Borwein, and S. Plouffe, On the rapid computation of various polylogarithmic constants. Math. Comput. 66, 903–913 (1997).
41. H. Zenil, F. Soler-Toscano, K. Dingle, and A. Louis, Correlation of automorphism group size and topological properties with program-size complexity evaluations of graphs and complex networks. Physica A: Statist. Mech. Appl. 404, 341–358 (2014).
42. H. Zenil, N. A. Kiani, and J. Tegnér, A review of graph and network complexity from an algorithmic information perspective. Entropy 20(8), 551 (2018).
43. H. Zenil, N. A. Kiani, and J. Tegnér, Methods of information theory and algorithmic complexity for network biology. Sem. Cell Develop. Biol. 51, 32–43 (2016).
44. J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (University of Michigan Press, 1975).
45. J. Kitazono, R. Kanai, and M. Oizumi, Efficient algorithms for searching the minimum information partition in integrated information theory. Entropy 20(3), 173 (2018). arXiv:1712.06745.
46. G. Tononi, Consciousness as integrated information: A provisional manifesto. Biol. Bull. 215(3), 216–242 (2008). doi:10.2307/25470707.

Appendix A

A.1. How a meta-perturbation test works

In order to explain the advantages of the generalization of information in the form of schemas computed by simple rules, Table A.2 is introduced. This table shows all possible cases where the pattern {8,9} = {1,1} can be found in the whole output repertoire of the system introduced in Figure 5.7 of the main text. On the right side of the set contained in Table A.2, the outputs where {8,9} = {1,1} are highlighted in red. On the left side, the 9-length strings are the inputs that yield outputs containing the desired pattern. On this left side, in bold, are the corresponding inputs that are particularly responsible for causing the desired pattern, that is, all possible patterns for the inputs {1,3,5,6,7}.

Table A.1. Seven-node system and output repertoires. (A) Network definition by adjacency matrix (lines 1–7) and dynamics (line 8). (B) Output repertoire.

Table A.2. Comparison between the real behavior and the unfolded nested rule of behavior of node 4 of the system defined in Figure 5.6 of the main text. The answer offered by the system (second results square) shows the places where the digit 0 is found in the behavior of node 4 (first results square).

In order to obtain the results in Table A.3 using a naive (brute-force) approach, it was necessary to define the whole set of all 2^9 possible inputs and compute the whole set of outputs; an exhaustive search for {8,9} = {1,1} was then carried out. Notice that the time and memory used are at least 10 times greater than those used in the φK approach. These results are shown in the last two rows of Table A.3 and the results square in Table 5.4.
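For concreteness, the brute-force search just described might be rendered in Python as follows. This is our illustrative sketch, not the authors' code; the function step is a placeholder for the network dynamics, which we do not reproduce here.

    # Illustrative brute-force search (hypothetical names; `step` stands in
    # for one synchronous update of the 9-node network dynamics).
    from itertools import product

    def brute_force_search(step, pattern_nodes=(8, 9), pattern=(1, 1), n=9):
        """Enumerate all 2^n inputs and keep those whose output matches
        the wanted pattern on the given (1-based) nodes."""
        hits = []
        for bits in product((0, 1), repeat=n):   # all 2^9 = 512 inputs
            out = step(bits)                      # compute the output state
            if tuple(out[i - 1] for i in pattern_nodes) == pattern:
                hits.append((bits, tuple(out)))
        return hits

The point of the comparison in Table A.3 is that this enumeration pays the full exponential cost up front, whereas the schema-based φK approach does not.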

page 211

August 2, 2021

212

16:18

Handbook of Unconventional Computing (in 2 Vols.) - 9in x 6in

b4205-v1-ch05

A. Hern´ andez-Espinosa et al.

Table A.3. Naive approach to looking for algorithmic patterns based on simplicity vs. complexity in the calculation of integrated information.

{1,0,1,0,1,0,1,0,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,0,1,0,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,0,1,0,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,0,1,0,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,1,1,0,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,1,1,0,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,1,1,0,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,1,1,0,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,0,1,1,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,0,1,1,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,0,1,1,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,0,1,1,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,1,1,1,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,1,1,1,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,1,1,1,0}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,1,1,1,0}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,0,1,0,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,0,1,0,1}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,0,1,0,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,0,1,0,1}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,1,1,0,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,1,1,0,1}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,1,1,0,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,1,1,0,1}->{1,1,0,1,0,0,1,1,1}
{1,0,1,0,1,0,1,1,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,0,1,1,1}->{1,1,0,1,0,0,1,1,1}
{1,0,1,1,1,0,1,1,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,1,1,0,1,1,1}->{1,1,0,1,0,1,1,1,1}
{1,0,1,0,1,1,1,1,1}->{1,1,0,1,0,0,1,1,1}   {1,1,1,0,1,1,1,1,1}->{1,1,0,0,0,0,1,1,1}
{1,0,1,1,1,1,1,1,1}->{1,0,0,1,0,0,1,1,1}   {1,1,1,1,1,1,1,1,1}->{1,0,1,0,1,1,1,1,1}

Time used: 0.166334
Memory used: 217920


Appendix B

Algorithm 1: computeIntegratedInformation
input : AdjacencyMatrix, Dynamic, CurrentState
output: integrated information value

nodes ← GetNodes(AdjacencyMatrix)
// UPPD: Unrestricted Past Probability Distribution
// UFPD: Unrestricted Future Probability Distribution
// am = AdjacencyMatrix; cs = CurrentState
UPPD ← ComputesPastProbabilityDistribution(nodes, CurrentState, ∅, Dynamic, am)
UFPD ← ComputesFutureProbabilityDistribution(nodes, CurrentState, ∅, Dynamic, am)
conceptualSpace ← ComputesConceptualSpace(am, Dynamic, cs, UPPD, UFPD)
integratedInformationValue ← 0
bipartitionsSet ← Bipartitions(conceptualSpace)
foreach bipartition b_i ∈ bipartitionsSet do
    aux ← EMD(b_i, conceptualSpace)
    if aux > integratedInformationValue then
        integratedInformationValue ← aux
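A rough Python rendering of Algorithm 1's outer loop is sketched below. This is our illustration with hypothetical names, not the authors' implementation: the construction of the conceptual space is omitted, each concept is assumed to be a 1-D distribution so that the EMD reduces to a cumulative-sum distance, and a bipartition is scored by how far its cut-off concepts move toward their unconstrained (uniform) repertoires, a simplification of the EMD between partitioned and full conceptual spaces used in the pseudocode.

    # Sketch of Algorithm 1's bipartition loop (illustrative only).
    import itertools
    import numpy as np

    def emd_1d(p, q):
        # Earth mover's distance between two 1-D distributions on a common support.
        return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

    def integrated_information(concepts):
        """concepts: dict mapping mechanism name -> 1-D distribution (float array)."""
        names = sorted(concepts)
        phi = 0.0
        # enumerate bipartitions of the conceptual space (one side suffices)
        for r in range(1, len(names) // 2 + 1):
            for cut in itertools.combinations(names, r):
                aux = sum(
                    emd_1d(concepts[n], np.full(concepts[n].shape, 1.0 / concepts[n].size))
                    for n in cut
                )
                phi = max(phi, aux)
        return phi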

Algorithm 2: computeConceptualSpace
input : AdjacencyMatrix, Dynamic, CurrentState, UPPD, UFPD
output: conceptualStructure

// UPPD: Unrestricted Past Probability Distribution
// UFPD: Unrestricted Future Probability Distribution
nodes ← GetNodes(AdjacencyMatrix)
mechaSet ← Subsets(nodes)
foreach mechanism mecha_i ∈ mechaSet do
    OneConcept ← ComputeConceptOfAMechanism(mecha_i, nodes, CurrentState, UPPD, UFPD)
    Append(conceptualSpace, OneConcept)


Algorithm 3: computeConceptOfAMechanism
input : mechanism, nodesForPurviews, currentState, pastDistro, futDistro, Dynamic, AdjacencyMatrix
output: concept for the current mechanism

// nodes from which all purviews will be taken
purviewsSet ← Subsets(nodesForPurviews)
for j ← 1 to Length(purviewsSet) do
    aPurview ← Part(j, purviewsSet)
    connected ← FullyConnectedQ(mechanism, aPurview)
    if connected then
        // MIP: Minimum Information Partition
        smallAlpha ← ComputesMIP(mechanism, aPurview)
        // aPurviewMIP: the purview responsible for causing the MIP of the current mechanism
        aPurviewMIP ← smallAlpha("PurviewMIP")
        // The following sum is formalized in Figure 5.4 and in Text S2 of Oizumi et al. (2014)
        // cs = CurrentState; am = AdjacencyMatrix
        pastDistribution ← ComputesPastProbabilityDistribution(mechanism, cs, aPurviewMIP, Dynamic, am)
        futureDistribution ← ComputesFutureProbabilityDistribution(mechanism, cs, aPurviewMIP, Dynamic, am)
        ConceptualInfo ← EMD(pastDistro, pastDistribution) + EMD(futDistro, futureDistribution)


Algorithm 4: ComputesMIP
input : mechanism, purview
output: MIP structure

mechaChildren ← Subsets(mechanism)
purviewChildren ← Subsets(purview)
ci ← 10000; ei ← 10000
foreach mecha m_i ∈ mechaChildren do
    foreach purview p_i ∈ purviewChildren do
        // cs = CurrentState; am = AdjacencyMatrix
        pastDistribution ← ComputesPastProbabilityDistribution(m_i, cs, p_i, Dynamic, am)
        futureDistribution ← ComputesFutureProbabilityDistribution(m_i, cs, p_i, Dynamic, am)
        cei ← ComputesCEI(m_i, p_i, pastDistribution, futureDistribution)
        if cei("ci") < ci then
            ci ← cei("ci"); pastMIP ← (m_i, p_i)
        if cei("ei") < ei then
            ei ← cei("ei"); futMIP ← (m_i, p_i)


Algorithm 5: computesCEI
input : ChildMecha, ChildPurview, ParentMecha, ParentPurview, ParentPastDistro, ParentFutDistro, UnconstrainedPastDistro, UnconstrainedFutDistro
output: cause and effect information values

mechaComplement ← Complement(ChildMecha, ParentMecha)
purviewComplement ← Complement(ChildPurview, ParentPurview)
mechaChildren ← Subsets(mechanism)
purviewChildren ← Subsets(purview)
foreach mecha m_i ∈ mechaChildren do
    foreach purview p_i ∈ purviewChildren do
        ChildMecha ← m_i
        ChildPurview ← p_i
        if ChildMecha = ∅ then
            pastDistribution ← UnconstrainedPastDistro
            futureDistribution ← UnconstrainedFutDistro
        else if ChildPurview = ∅ then
            pastDistribution ← 1
            futureDistribution ← 1
        else
            pastDistribution ← ComputesPastProbabilityDistribution(ChildMecha, ChildPurview, cs, am)
            futureDistribution ← ComputesFutureProbabilityDistribution(ChildMecha, ChildPurview, cs, am)
        if mechaComplement = ∅ then
            pastDistributionComp ← UnconstrainedPastDistro
            futureDistributionComp ← UnconstrainedFutDistro
        else if purviewComplement = ∅ then
            pastDistributionComp ← 1
            futureDistributionComp ← 1
        else
            // CPPD = ComputesPastProbabilityDistribution
            // CFPD = ComputesFutureProbabilityDistribution
            pastDistributionComp ← CPPD(mechaComplement, purviewComplement, cs, am)
            futureDistributionComp ← CFPD(mechaComplement, purviewComplement, cs, am)
        pastDistribution ← Normalize(pastDistribution * pastDistributionComp)
        futureDistribution ← Normalize(futureDistribution * futureDistributionComp)
        ci ← EMD(ParentPastDistro, pastDistribution)
        ei ← EMD(ParentFutDistro, futureDistribution)
        cei ← Min(ci, ei)


Algorithm 6: computesPositionsOfAPatternInOutputs
input : Mechanism, Purview, AdjacencyMatrix, CurrentState, Dynamic
output: positions (indexes) where the current state of the mechanism is found in the output repertoire

// The remaining nodes act as nodes that send inputs to the mechanism.
// These remaining nodes are in fact powers that define the distribution
// of the wanted pattern defined by the mechanism.
joinedNames ← Join(Inputs(Mechanism))
powers ← Complement(Range(Length(AdjacencyMatrix)), joinedNames) − 1
foreach node n_i ∈ powers do
    Append(sumandos, 2^n_i)
sumandos ← Subsets(sumandos)
foreach sumando s_i ∈ sumandos do
    Append(aux, Sum(s_i))
sumandos ← aux
ins ← Inputs(FirstNode(Mechanism))
// Given a dynamic, compute all possible inputs of a given size that
// result in a given output (cs)
repertoire ← RepertoireByOutput(Length(ins), Dynamic, cs)
foreach node m_i ∈ (Mechanism − FirstNode(Mechanism)) do
    intersec ← Intersection(ins, Inputs(m_i))
    ins ← ins + (Inputs(m_i) − intersec)
    repertoire ← Combine(repertoire, CreateRepertoire(Inputs(m_i) − intersec))
    repertoire ← FilterRepertoireByOutput(repertoire, cs)
foreach sumando s_i ∈ sumandos do
    foreach repert r_i ∈ repertoire do
        Append(indexes, Sum(s_i, r_i))


Algorithm 7: computesPastProbabilityDistribution
input : Mechanism, Purview, AdjacencyMatrix, CurrentState, Dynamic
output: probability distribution for a mechanism

locations ← computesPositionsOfAPatternInOutputs(Mechanism, CurrentState, Purview, AdjacencyMatrix)
allInputs ← Extract(locations, Purview)
probability ← 1/Length(locations)
foreach input in_i ∈ allInputs do
    correctedLocations ← FindPatternInInputs(Purview, in_i, Length(AdjacencyMatrix))
    probabilityDistribution ← ComputesProbabilityForElements(correctedLocations)

Algorithm 8: computesFutureProbabilityDistribution
input : Mechanism, Purview, AdjacencyMatrix, CurrentState, Dynamic
output: probability distribution for a mechanism

locations ← FindPatternInInputs(Purview, CurrentState, Length(AdjacencyMatrix)) − 1
allOutputs ← ComputesOutputs(locations)
allInputs ← Extract(locations, Purview)
probabilityDistribution ← ComputesProbabilityForElements(allInputs)


Algorithm 9: findPatternInInputs
input : Nodes, WantedPattern, sizeAdjacencyMatrix
output: indexes in the input repertoire where the nodes fulfill WantedPattern

limit ← 2^sizeAdjacencyMatrix
foreach node n_i ∈ Nodes do
    powers ← 2^(n_i − 1)
    repetitions ← limit/powers
    longi ← limit/repetitions
    if expectedPatt = 1 then
        serie ← CreateSerieEvenNumbers(repetitions)
    else
        serie ← CreateSerieOddNumbers(repetitions)
    for i = 1; i < Length(serie); i = i + 1 do
        found ← Range(((powers * serie[i]) − longi) + 1, powers * serie[i])
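The same idea can be rendered in Python by direct enumeration (our illustrative sketch; Algorithm 9 instead constructs the index ranges arithmetically via the even/odd series, which avoids visiting every row):

    # Find the row indices of a 2^n input repertoire where selected nodes
    # take wanted bit values (illustrative; direct enumeration rather than
    # the series construction of Algorithm 9).
    def find_pattern_in_inputs(nodes, wanted, n):
        """nodes: 1-based node positions; wanted: one bit per node; n: system size."""
        hits = []
        for idx in range(2 ** n):
            bits = [(idx >> (n - k)) & 1 for k in nodes]  # node k's bit, MSB-first
            if bits == list(wanted):
                hits.append(idx)
        return hits

    # e.g., for a 3-node system, rows where node 1 is ON:
    # find_pattern_in_inputs([1], [1], 3)  ->  [4, 5, 6, 7]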

Algorithm 10: computeInputBitProbabilityDistro
input : Nodes, AdjacencyMatrix, Dynamics
output: bit probability for the given nodes

locations ← FindPatternInInputs(Nodes, 1, Length(AdjacencyMatrix))
oneProbability ← Length(locations)/2^Length(AdjacencyMatrix)
zeroProbability ← 1 − oneProbability

Algorithm 11: computeOutputBitProbabilityDistro
input : Nodes, AdjacencyMatrix, Dynamics
output: bit probability for the given nodes

locations ← ComputesPositionsOfAPatternInOutputs(Nodes, 1, Dynamics, AdjacencyMatrix)
oneProbability ← Length(locations)/2^Length(AdjacencyMatrix)
zeroProbability ← 1 − oneProbability
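Algorithms 10 and 11 then reduce to a ratio of counts. Using the enumeration sketched after Algorithm 9 (again our hypothetical rendering, not the authors' code):

    # Input-bit probability as in Algorithm 10 (illustrative).
    def input_bit_probability(nodes, n):
        ones = len(find_pattern_in_inputs(nodes, [1] * len(nodes), n))
        one_probability = ones / 2 ** n
        return one_probability, 1 - one_probability

    # input_bit_probability([1], 3)  ->  (0.5, 0.5)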


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0006

Chapter 6

Robot Narratives

Marina Sanz Orell, James Bown∗, Susan Stepney†, Richard Walsh‡ and Alan F. T. Winfield§

∗School of Design and Informatics, Abertay University, UK
†Department of Computer Science, and York Cross-disciplinary Centre for Systems Analysis, University of York, UK
‡Department of English and Related Literature, and Interdisciplinary Centre for Narrative Studies, University of York, UK
§Bristol Robotics Lab, University of the West of England, Bristol, UK

There is evidence that humans understand how the world goes through narrative. We discuss what it might mean for embodied robots to understand the world, and communicate that understanding, in a similar manner. We suggest an architecture for adding narrative to robot cognition, and an experimental scenario for investigating the narrative hypothesis in a combination of physical and simulated robots.

6.1. Introduction

We start from the narrative hypothesis, that humans understand 'how the world goes' through narrative: we make sense of more or less complex events through the stories we hear, tell, imagine, and construct: "narrative is our innate way of representing process — it's the form in which we make sense of stuff happening [...] our cognitive


framework for representing behavior is narrative" (see Ref. [1, p. 5]). This starting point has led us in two related directions.

First, complex systems and their emergent properties, including feedbacks, multi-scale interactions, and tipping points, appear unnarratable, except by giving the emergent property some form of agency (e.g., evolution and Mother Nature2). We explore issues around this challenge of narrating complexity in Ref. [3].

Second, if we wish to communicate with artificially intelligent robots, be they our helpers, workmates, or carers,4 then they need to understand and relate to the world the way we do: through stories. And if they can do so, can they then relate to each other in the same manner? Such questions of robot narratives are the focus of this chapter.

Our discussion is structured in three sections. In Section 6.2, we provide a motivating example in the form of contrasting narratives of robots exploring a planet. The first set of robots have a narrative understanding of their situation and task; the second set have a declarative logical understanding. In Section 6.3, we outline a design for a model of robot narrative understanding, and then discuss two narrative scenarios related to negotiating a flight of steps. In Section 6.4, we suggest a programme of experimental robotics that could be used to develop and explore robot narratives, and to test the narrative hypothesis in robotics.

6.2. Robots Explore an Alien Planet

Here, we give a motivating example of "narrative robots" through two contrasting tales of exploration. Two teams of robots are given the same task (Section 6.2.1), that of thriving on an unexplored planet. The "Greek" team have narrative understanding (Section 6.2.2); the "Roman" team use declarative logic (Section 6.2.3). They have different experiences, and we the readers have different reactions to their discussions. Clearly, this is an imagined example, with certain situations exaggerated and foregrounded to make our points (that is, it is a story), but our aim is to illustrate what might be possible with narrative understanding.


6.2.1. Prologue

Twelve identical intelligent robots awaken on an unexplored planet, standing in a circle. Their knowledge of the territory is limited to the laws of physics and chemistry, which are as applicable here as they are on planet Earth, given the planets' similar atmospheric composition and surface gravity. The robots' mission is to thrive, to prosper, to attain full potential; they understand what that means with regard to their survival, but not how this new environment will allow it.

To succeed, the robots decide that they should cooperate and work as a team. The first step is surviving; for that they need information, fuel, material for maintenance, a base, and other resources. With limited information about the terrain, the robots realize that their first task should be reconnaissance. They divide the terrain into twelve equal sectors centered on their current position. They plan to explore for six hours, then return to share their experiences.

Once the six hours have passed, each robot starts heading back to the original meeting place. They arrive back at different times, since those robots that encountered fewer challenges went further than others. Some carry more relevant information than others; some have recorded more details about their experiences. Some have encountered a more varied range of phenomena, and some have developed more complex ideas and processes. All these differences now mean that the 12 robots are no longer identical: they are separate entities with different knowledge, different priorities, and different skills.

Once the 12 are reunited, they start relating their discoveries to their companions, through narrative (Greek) or declarative (Roman) means.

6.2.2. Greek Olympus, narrative robots

Zeus, Hera, Poseidon, Ares, Aphrodite, Hades, Hephaestus, Athena, Hermes, Artemis, Demeter and Hestia awaken on the unexplored planet Greek Olympus (Figure 6.1), go off exploring, reunite, and start telling the stories of their exploration.


Figure 6.1. The 12 Roman (declarative) and Greek (narrative) robots exploring planet Olympus, with some of the features they discover.

Hermes tells about his experience first. This is his story: I started moving in a straight line from my origin. For the first hour of moving at a moderate speed I saw nothing relevant; the ground was sandy and red as it is here. During the next hour I started encountering rocks that varied in size and were very hard but cracked easily. Initially these rocks were scattered, but during my third hour of travelling they started to appear in larger groups and sizes, and I predicted that I was approaching a bigger rock formation. This suspicion was confirmed as I realised that I was entering a canyon. Natural stone pillars of the same red colour towered around me and strong gusts of wind carried grains of sand into my joints, making movement slower and more difficult. Then I encountered a deep


narrow gorge that cut across the straight path I had been following. I figured that the least dangerous and risky solution was to move around it to get to the other side. It was slow but safe, as expected. I continued moving along the canyon. There were many cracks and holes in the wall that weren't big enough to fit my body through. The entire formation looked quite fragile, and when I tried testing the malleability of one of the rocks it didn't sustain much stress before disintegrating. I continued on until the sixth hour struck, and I turned back to return here. I didn't find any fuel material, and my sector doesn't offer any good candidates for a settlement location, as the canyon appears dangerous and unstable.

Hermes finishes his story; the other 11 robots are listening carefully, processing the story, and noting the main points:

• The first sector does not contain fuel sources or settlement locations.
• There is a canyon-like geological formation that appears dangerous and unstable. The rocks that form it were hard but brittle. Hermes has warned them off the area.
• Gusts of wind might carry sand particles that get stuck in articulations, making movement complicated; this is a real danger that Hermes warns about from personal experience.
• A successful way to deal with a deep narrow gorge is to walk around it.
• Based on Hermes' tale, the wind and the deep gorge situations are inconvenient but not insuperable.

Following Hermes' tale, it is Artemis' turn to share her experiences. But Zeus intervenes, before his turn, to tell the others that he has time-sensitive, important information: he has witnessed a distant slow-moving thunderstorm during his exploration, and predicts it will reach their current location in about an hour. He says the best course of action is to have shorter reports, so they can quickly decide on a settlement location and take shelter from the storm. All the robots agree this is the best course of action, so they give their reports in a shorter form.


Hestia describes a location that is less than an hour away and consists of a cliff with deep caves all over its surface, like a rocky beehive. Some of the caves form interconnected tunnels inside the cliff, and the lowest caverns may contain running water. The caves are structurally sound, and there were no noticeable hazards, just a lot of varied vegetation in the area in front of the cliff.

Aphrodite encountered few obstacles, and so her report focusses on the biodiversity in her sector:

I found my sector to be quite plain and safe. I encountered no obvious hazards or obstacles. I did record various forms of carbon-based organisms akin to vegetation; between the second and fourth hours I was moving across purple fields covered in a specific type of these organisms, and also a species of fungal-like life forms. Past these fields there were plains with scattered groups of flora arranged in a bush-like fashion. It was pleasant and safe, and a good source of organic material. It could potentially present a good settlement location.

Hestia's and Aphrodite's reports are considered alongside Demeter's and Artemis' for potential settlement locations. Zeus presents the four options, since he is managing this time-sensitive mission. The robots have developed slightly different impressions and perspectives because of their different experiences, but they are still sufficiently similar that there is consensus to declare Hestia's suggestion as optimal. Hestia gives the group instructions on how to get there:

We head directly north. After about 30 minutes, assuming we're moving fast, we'll encounter a fairly steep incline. It's better if we approach it from the right, because the path is smoother there. At the top of the incline there's a plain that we have to cross, and finally we reach the cliff. We can enter one of the ground-level caves without any water and settle there.

With these directions, the group of robots starts moving towards the location. As they do, each robot exercises its own simulation as it faces smaller challenges, such as adapting to different kinds of soil, and updates its world model. Hermes struggles, as his movements are made difficult by the sand still in his joints. Ares has developed a


crouched and jerky way of moving after navigating a dense jungle in his sector, and he has to modify his style for the new terrain. Hera is used to taking big strides, from walking through snowy slopes. They all alter their methods of movement, and develop slightly different styles. When they reach the incline, Hestia simulates the best way to climb it as a group, and instructs them to move in single file along the smoother rightward path. They reach the cave, where they can take shelter from the approaching thunderstorm.

They are all aware they need to decide how to build their settlement, and so they engage in a session of brainstorming. The differing points of view give rise to different ideas and a richer understanding. They extract plans from all their suggestions, based on their different observations and considerations, and vote on how to address the shelter situation and distribute the other tasks. The robots adopt different roles according to their skills.

Having developed different areas of expertise and different perspectives on the world and their own selves, each robot grows in different directions. They diversify until they no longer have a unique memory bank that characterizes them all, but a different way of perceiving, processing, story-telling and overall sense-making. This gives rise to individual behaviors and a range of dynamics like leadership and competition, and other forms of social interaction like games and culture and art. We might like to think such behaviors are uniquely human, but we see similar practices emerge in some animal species, such as dolphins and chimpanzees.5

So in our story here, our Greek Olympians go on to develop humor, sport, poetry, and, of course, story-telling.

6.2.3. Roman Olympus, declarative robots

Jupiter, Juno, Neptune, Mars, Venus, Pluto, Vulcan, Minerva, Mercury, Diana, Ceres and Vesta awaken on the unexplored planet Roman Olympus (Figure 6.1), go off exploring, reunite, and start stating the facts about their exploration.


Mercury states his experience first, then all the others follow in turn. Each robot describes their exploration in detail, giving explicit descriptions of everything they have encountered, including the scientific data. For example, Mercury states the details of the wind strengths along his journey, and then separately describes his physical status, noting that movement is slightly problematic due to sand in his joints. When they describe a hostile environment, like Mercury's canyon or Juno's snow, they give an objective description of the place. Every description is a detailed enumeration that allows each robot to construct an internal map of what is being described. By the end of the round of reports, each robot has an entire map of the world, sector by sector, characterized by the data reported by each robot in turn.

Jupiter does not intervene to warn them of the approaching thunderstorm; when his turn arrives, he describes the storm objectively, but it does not carry a sense of urgency. Now that they have all shared their experiences, they decide that they should act on the thunderstorm information and find a shelter. Their interior models are very similar to each other again, since they have all reported and recorded their respective data. They thus all determine they should go to Vesta's caves.

They all start moving towards the cliff with their excellent virtual maps; there are no unexpected surprises, since they know the terrain as if they had explored it first-hand. They all climb the incline in single file without a need to communicate. The thunderstorm catches up with them before they reach the caves, but fortunately it does not cause any damage. Once in the cave, they do not really need to communicate to decide on a plan, as they are all so similar. They do not establish different roles; they all remain uniform. In being so homogeneous, instead of a group of individuals, they all function as one. This is efficient, but it will severely limit creativity in their responses to forthcoming challenges.

So in our story here, our Roman Olympians remain 'robotic' in their behaviors and communications.


6.2.4. Epilogue

Dramatis Personae: Computer Scientist (CS); Narratologist (N); Robot (R)

CS is applying a screwdriver to R's head

N: What are you doing?

CS: I'm trying to make my robot more human-like, by making it speak naturally.

R: There is a hole. There is a corner. The hole is around the corner.

N: That doesn't sound very natural! You want to use Narrative Logic.

CS: A what? Is that something from that cognitive narratology you've been telling me about for years? (And even with us writing that book on it, I still can't say it!)

N: That's right. We know that people understand the world through narrative — through stories. If you want to make your robot sound more human-like, it needs to talk about the world the way we would. So you need to give it Narrative Logic. Then it can talk about the world using stories.

CS: How can I do that?

N: How does your robot work?

CS: It's got a model of the world in its head. It builds that from its interactions in the world. It then uses that model to run simulations, and to plan its actions in the world. I was bolting on a declarative grammar module, so that it can make statements about its world model.

N: Okay, so it sounds like you need to work with embodied and enactive cognition. Why don't I bolt on a Narrative Logic module to help with that?

N applies a spanner to R's head

R: I went round that corner yesterday. I fell into a hole. I was stuck there all day.


CS: That was easy!

N: Well, no, it's not really that easy. We have to design the narrative logic, and connect it with the robot's model. I know about Narrative Logic, but not how to connect it to robots.

CS: Well, I know about interfacing to robots, and simulating them. And I know how the model in this one's head works.

R: And I can help you run experiments!

CS: Hey, together, we could get robots to understand the world the way we do! We can map out different narrative structures for different problems! One for . . . a human companion. One for . . . talking about legal regulations. One for . . . social learning from each other's stories. One for . . .

N: Hang on, hang on! Before we start all that, we need to see if we can get this one robot to tell stories about its own world.

CS: But we could then use this robot to help design those other narrative logics? The robot would be more human-like?

R: Oh no! Here I am, brain the size of a planet . . .

N: Yeah, yeah . . . if the idea of robots telling stories works in the first place. Shall we work together on that first? And then we can tackle the bigger problem of social narrative, and of robots learning their world through stories, later on.

CS: Okay, sounds like a plan. Let's do it!

N: So, where could we get the funding to do that?

6.3. Robot Imaginations

How might such "narrative robots" be possible? In this section, we describe a particular architecture of a robot mind that contains a model of the world including itself, where the robot can use this model to simulate scenarios in order to choose among potential courses of action.6,7 This architecture has been suggested as a starting point for robots telling stories.8 We then illustrate some potential scenarios of this model in use.

6.3.1. Dennett's Tower

Winfield8 describes a succession of more complex creatures that can be used to design more complex intelligent robots. This is based


on Dennett’s idea of the Tower of Generate-and-Test,9 a conceptual model of levels of intelligence. Dennett’s Tower not only provides us with a powerful model for types of intelligence, but a compelling route toward much more capable socially intelligent robots. On each level of Dennett’s Tower are creatures successively more capable of reacting to and surviving in the world, each having more sophisticated strategies. At the lowest level are (Charles) Darwinian creatures: new individuals are generated by variation from their parent(s), and are tested by selection in the real world; populations ‘learn’ through evolution; individuals do not learn. Next are (B.F.) Skinnerian creatures, who generate possible actions and test them by enacting them in the real world; individuals learn through reinforcing successful behaviors. Next are (Karl) Popperian creatures, who have internal models, in which they generate possible actions, and test them in the imagined world; they discard unsuccessful ones without needing to enact them in the dangerous real world. Finally there are (Richard) Gregorian creatures, who are social learners: they can learn successful behaviors generated and tested by others. 6.3.2. Robots in the tower The majority of present-day robots, including those in research labs, have no mechanisms for learning: their behaviors are predesigned and hardwired. These robots do not even make it to the ground floor of Dennett’s Tower. A small number of research robots, within the subfield of evolutionary robotics,10 use genetic algorithms to evolve new behaviors: these are Darwinian creatures in Dennett’s scheme. Another small set of research robots use reinforcement learning approaches11 : these are Skinnerian. A handful of research robots have employed self-simulation embedded within the robot, to create Popperian robots.12–17 6.3.3. An architecture for an ethical Popperian robot Winfield and co-workers6, 7, 18, 19 describe an “ethical” robot, ethical in that it may choose actions that compromise its own safety in order to prevent another from coming to harm. This ethical robot has an embedded simulation of itself, other dynamic actors (robots),


Figure 6.2. Popperian robot model, with interior generate and test loop (adapted from Ref. [8, Fig. 4.1]).

and its currently perceived environment. This simulation is used as a real-time consequence engine, capable of modeling, evaluating and weighting possible next actions against safety and ethical rules.

The model is summarized in Figure 6.2. On the right is shown the standard "sense-plan-act" robot system. On the left is its additional Popperian "interior model" (IM), where a loop generates potential actions; these are simulated in the robot/world model in the context of the real perceived environment (sensor input); the simulated results are tested by the consequence evaluator; actions that result in beneficial consequences are promoted to the robot, while those that result in harmful consequences are inhibited.

Such a robot could also use its internal model to investigate other potential consequences of actions, thereby enabling it to make choices based on outcomes such as efficiency or safety, as well as ethical behavior. A minimal sketch of such a consequence-engine loop follows.
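The sketch below is our own illustration of the generate-and-test loop of Figure 6.2, not Winfield's implementation; world_model.simulate and evaluate are assumed interfaces.

    # Popperian consequence engine: simulate candidate actions internally and
    # promote only those whose predicted consequences score as beneficial.
    def consequence_engine(world_model, candidate_actions, evaluate):
        permitted = []
        for action in candidate_actions:
            predicted = world_model.simulate(action)  # imagined outcome, not enacted
            if evaluate(predicted) >= 0:              # safety/ethical score
                permitted.append(action)              # promote to action selection
        return permitted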


Digging into what would be needed to achieve this, we find several requirements. Rather than having to regenerate and test actions in each new context, there needs to be a repository of previous actions — real, imagined, or heard — and their consequences. This repository, along with the generate-and-test loop, can be used as a basis for future actions, imaginings, and stories.

The internal model may be inadequate in various respects, and an imagined action might be judged efficient or ethical or safe, but, when carried out in reality, result in unanticipated behaviors or consequences. The consequence evaluator needs to be able to evaluate real-world consequences, compare them with modeled consequences, and update both the model and the repository as needed.

On the social side, the system needs a story parser, to hear stories told by other robots, parse them into actions and consequences, and store these in the repository for future use. It also needs a story generator, that can take items from the repository and turn them into stories told using narrative logic. For these to be "stories", rather than bald sequences of actions, they need to be parsed and assembled through a "narrative logic", rather than as mere declarative statements.

It might be thought useful for the repository to include "evidential markers", to distinguish consequences determined through real experience ("I did"), imagined experience ("I think"), observations of others' real experience ("I saw they did"), others' reported real experience ("they said they did"), others' imagined experience ("they think"), and so on. A sketch of such a repository follows the two scenarios below.

See Figure 6.3 for how these components might work. In scenario 1 (a): Robot encounters some steps and wonders how to descend; it uses its internal model to evaluate scenarios until it finds a suitable course of action; it updates its repository with the imagined descent methods and their consequences ("I think I should do X, but not Y or Z"); it successfully descends the steps. Junior arrives and wonders how to descend; Robot tells Junior the story of how it successfully descended ("I did X"); Junior parses the story, imagines it through its own internal model, and learns how to descend. If Robot had instead fallen down the steps, because of a deficient model,


Figure 6.3. Scenario 1 (a) telling the story of a successful descent. Scenario 2 (b) telling the story of an imagined descent.

it would then update its model to be a better predictor, and update its repository with the real-world consequences of that particular descent method ("I thought I should do X, but I was wrong; I'll remember not to do that next time").

Scenario 2 (b): Robot encounters some steps and wonders how to descend; it uses its internal model to evaluate scenarios until it finds a suitable course of action ("I think I should do X, but not Y or Z"). Junior arrives and wonders how to descend; Robot tells Junior the story of how it imagines it should descend ("I think X will work"). If Robot is playing a more educative role, it might also say "and don't do Y or Z", problems that it might have imagined in this case, or learned from previous misadventures, or been taught by others when it was a more junior robot itself. If Junior runs these scenarios and finds a problem with X, or no problem with Y or Z, it can ask "why (not)?"; Robot can answer with the relevant consequences from its own more advanced world model, and Junior can update its own model to deliver those consequences, too.
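One possible shape for such a repository entry, with evidential markers, is sketched below in Python. The field names are hypothetical; nothing here is prescribed by the architecture.

    # A repository of past actions and consequences, tagged with evidential markers.
    from dataclasses import dataclass

    @dataclass
    class Episode:
        action: str        # e.g., "descend steps sideways"
        consequence: str   # e.g., "reached bottom safely"
        evidence: str      # "I did" | "I think" | "I saw they did" | "they said they did" | ...
        source: str        # which robot the episode came from

    repository = [
        Episode("do X", "safe descent", "I did", "Robot"),
        Episode("do Y", "fall", "I think", "Robot"),
        Episode("do X", "safe descent", "they said they did", "Junior"),
    ]

Stories would then be generated by selecting episodes from such a repository and ordering them with the narrative logic discussed in Section 6.3.5.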


narrative in a "story logic";20 this logic is expressed in the structure of stories, and is the basis of the narrative understanding by which we make sense of stories and use stories to make sense of experience. A story, by adhering to the formal features of story logic, gives narrative framing to the information it contains; it assimilates that information to an established structure of meaning, and in doing so attributes a particular relevance and significance to it. Because narrative foregrounds action and events, it is a privileged means of representing the behavior of agents in interaction with their environment and each other.

Crucially, stories mediate between the particulars of experience and the general framework of narrative understanding manifest in the set of stories in circulation, or those already familiar to a particular individual (which define that individual's narrative competence). There is a reflexive relation between particular stories and the story logic they use; each story depends upon the current set of stories as the basis for its intelligibility and significance, but also supplements that set and affects the general context of narrative understanding. The learning potential in this evolving culture of narrative meaning epitomises the distinctive kind of advantage Dennett attributes to Gregorian creatures.

Our short play earlier contrasts two statements by the robot character:

R: There is a hole. There is a corner. The hole is around the corner.

R: I went round that corner yesterday. I fell into a hole. I was stuck there all day.

The first set of sentences are declarative statements, and there is no story. The second set of sentences are also grammatically declarative, yet they form an (albeit trivial) story. What is the difference?

Consider just the two sentences, "I went round that corner yesterday. I fell into a hole." From a human perspective this constitutes a narrative, but that does not make it narrative for the


robot. Each sentence is a narrative utterance in its own right — by virtue of the action represented in the verbs "went" and "fell" — and the temporal extension of those processes is easily available to inference; but the information they convey might equally be taken as something closer to a status update. More importantly, the narrative articulation between the two sentences is not encoded by anything except their juxtaposition. Nothing positively requires us to understand them as more than a pair of unrelated assertions. If the robots are to make narrative sense of their experiences, and of the stories they tell each other, they need to be provided with a rudimentary narrative logic, distinct from their linguistic competence and from their sensory engagement with their environment.

Our own predisposition towards narrative sensemaking makes available the inference that the two statements are to be understood sequentially: that falling in the hole followed upon going around the corner. Only on that basis is the further inference available that falling in the hole was a consequence of going around the corner. This last inference is what gives the utterance its main communicative relevance: beyond the mere declarative information that there is a specific hole around a specific corner, the narrative particulars instantiate a generalization: corners may hide holes.

So there are two kinds of narrative implication involved: sequential-causal implication, and particular-general implication. These are both fundamental narrative heuristics, essential to narrative's value as a form of sensemaking, even while they lack logical rigour. Narrative logic is inexact, and prone to fallacy: the post hoc ergo propter hoc fallacy (that what comes after is caused by what came before), and the inductive fallacy (that what is true in this case is true in all cases). A toy rendering of these two heuristics is sketched below.

As the articulation of temporal experience, narrative is essentially concerned with matters of change and continuity, or of temporal difference and relation; there is no narrative object as such. Its connective logic therefore cannot be made fully explicit, but only pursued to some extent, within an implicit context of assumptions. The effectiveness of narrative in cognition and communication depends upon the cultural process of continuous reflexive refinement of narrative sensemaking through the circulation of stories.
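As a toy rendering (ours, purely illustrative) of the two heuristics over (agent, verb, object) event tuples, deliberately defeasible just as the text stresses:

    # The two narrative heuristics as defeasible inference rules (toy example).
    def sequential_causal(events):
        # post hoc ergo propter hoc: juxtaposed events are read as cause/effect
        return list(zip(events, events[1:]))

    def particular_general(cause, effect):
        # inductive leap: a single observed case is promoted to a generalization
        return f"'{cause[1]} {cause[2]}' may lead to '{effect[1]} {effect[2]}'"

    story = [("I", "went round", "the corner"), ("I", "fell into", "a hole")]
    links = sequential_causal(story)
    rule = particular_general(*links[0])
    # rule: "'went round the corner' may lead to 'fell into a hole'"

Both rules are fallible by design: the causal link is only conjectured from juxtaposition, and the generalization is induced from a single case.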


6.4. Gregorian Chat

We have started from the position that narrative is the way humans understand how the world goes, and have discussed an architecture whereby robots might be endowed with narrative intelligence, too. However, there is nothing in that discussion that requires the stories heard and generated to be based on narrative logic specifically; the same argument could be used to support purely declarative statements. The narrative hypothesis is stronger, and in this section we discuss it in some more depth, and outline how one might go about testing the narrative hypothesis in general, through robots with internal models and story-telling capabilities: Gregorian robots chatting with each other.

6.4.1. The narrative hypothesis in more detail

We take the narrative hypothesis, that humans understand how the world goes through stories, and break it down into specific claims:

• Narrative frames our understanding of how the world goes, in that we necessarily represent and communicate that knowledge in the elemental form of stories.
• The affordances of narrative cognition are the legacy of our evolutionary adaptation to our environment, and set the terms for our continuing understanding of the world.
• Any given story mediates between its particulars and the general logic of narrative form; narrative provides a route from episodic to general semantic memory.
• Narratives are social: we learn from the stories of others as well as by giving narrative form to our own experience.

6.4.2. The Gregorian chat system

Here we tell a story of how the narrative hypothesis might be tested through a series of experiments involving robots with internal models. The plan would be to build a small society of robots, each with an internal model, generate-and-test loop, consequence evaluator, repository of past actions, and story parser and generator. The robots



Figure 6.4. (a) A physical robot (solid line) with a simulated self-model (dashed line); (b) the same simulation used to simulate a robot with a simulated self-model.

would be placed in a complex environment where they could explore, encounter dangers and rewards, interact with other robots, hear and tell stories of the world, reproduce and evolve. The hypotheses could be tested in various ways, particularly by contrasting robot societies based on declarative logic stories versus narrative logic stories.

Such a project would be an ambitious undertaking, and it would be essential to control the difficulty of implementation, experiment, and evaluation. However, it should be possible to use much "off-the-shelf" technology, to augment embodied robots with simulations, and to constrain the environment, as described here.

6.4.2.1. Robot architecture

Augmenting embodied robots with simulated robots. As described above, the individual robots need an internal model of themselves, for the Popperian simulation and consequence evaluation. A key insight6,14 is that the very same simulation approach used for an internal model can be used to simulate multiple instances of a larger population of robots, each with their own internal model.

Giving robots comprehensible grammars. The robots' grammars should conform to (a subset of) English, so the generated stories and declarative statements can be analyzed. One way to accomplish this would be to equip robots with off-the-shelf speech-to-text and text-to-speech, allowing them to hear and produce stories externally, but readily transform this to text. Any errors in this translation,


in either direction, can be considered to form a necessary part of the embodiment.14,17,21

Formalizing narrative logic. A crucial part of any investigation of the narrative hypothesis in robots is the need to develop a formalization of narrative logic, such that the robots can construct "stories" rather than declaim a sequence of facts. From the discussion in Section 6.3.5, this is a non-trivial challenge, and, in fact, forms the core of any such investigation: how can we provide a narrative logic that is simultaneously formal enough to be programmed, and supple enough to capture the inexactness, implicit inferencing, and potentially fallacious reasoning underlying our human narrative understanding?

6.4.2.2. Hybrid physical/virtual environment architecture

The environment should be sufficiently complex to support a range of useful stories about it. There should be a complex geometry for the robots to navigate, with dangers and rewards, and opportunities to meet, observe, and interact with other robots. This complexity could be achieved without the need for a large physical setup, by exploiting a combination of simulated and physical environments (Figure 6.5):

• "here", comprising physical robots in a simple physical social environment, the home campfire, where the robots can tell each other stories of their experiences;
• "there", comprising a mix of physical and simulated robots experiencing and observing a complex physical environment supporting adventure-generating narratives;
• "yonder", comprising simulated robots experiencing and observing a range of complex simulated environments supporting more varied narratives.


Figure 6.5. The environmental architecture, comprising physical (solid line) and simulated (dotted line) robots embodied in physical (solid line) and simulated (dashed line) environments, mediated by an Environment Orientation layer.


In order to simplify the robot perception issues, and to allow physical and simulated robots to interact, the environment could be implemented using Environment Orientation22 in the form of a "spoken dungeon". The environment as "dungeon master" could speak aloud cues to the robots — "you are by a river"; "there is an unknown robot behind you"; "Junior has fallen in the river" — and manage the simulated robots. Such an environment allows high levels of control and configuration, which are necessary to test the narrative hypothesis (a toy rendering of such a dungeon-master loop is sketched after the list of hypotheses below).

6.4.3. Testing the narrative hypothesis in a robot ecology

Through such robot and environment architectures, it would be possible to simplify the low-level perception and communication implementations, whilst maintaining the advantages of embodiment,14,23 and focus on the issues of interest: narrative generation and transmission. Such an approach would allow the narrative hypotheses to be formulated in a concrete form that would allow testing and evaluation in the following way:

• Narrative influences understanding: Seed the system with a range of different narrative styles, and observe and analyze the robots' responses to environmental and social stimuli, both previously seen and novel.
• The world influences narratives: Seed the simulations with different environments (for example, safe versus dangerous, simple versus complex, 2D versus 3D) and observe and analyze the differences in the narrative structures that form.
• Narratives are social: Implement the same scenarios for collections of Popperian (non-social) and Gregorian (social) robots, and evaluate and compare their responses to situations previously seen by self, by others, or novel.
• Narrative converts episodic to generic memory: Seed the robots with narrative and declarative grammars, and evaluate and compare their responses to situations.
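The toy dungeon-master loop below is our sketch only; Environment Orientation itself is described in Ref. [22], and hear is an assumed robot interface.

    # "Spoken dungeon": the environment narrates cues to physical and simulated
    # robots through the same interface (illustrative only).
    import random

    CUES = [
        "you are by a river",
        "there is an unknown robot behind you",
        "Junior has fallen in the river",
    ]

    def dungeon_master(robots, steps=3):
        for _ in range(steps):
            cue = random.choice(CUES)   # scripted or scenario-driven in practice
            for robot in robots:
                robot.hear(cue)         # same call for physical and simulated robots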


Such an approach would support an "ecological" system of robots, with predefined grammars and internal models coping with a single generation of the world.

6.4.4. Extensions of the approach

The above scenario is in some sense the simplest approach to testing the narrative hypothesis. The same robot and environment architecture could be exploited to test that narrative hypothesis in more depth.

It could be extended to a system of evolutionary robots, evolving their internal models and repositories over generations. We hypothesize that the evolved robots would be able to cope with environmental change more robustly than the purely ecological systems.

It could be extended to a "nested" model of self and other robots, where the internal model of self includes its own model of self, and others' models of self, etc., each with decreasing fidelity, to avoid infinite regress (see Figure 6.6). We hypothesize that the nested models would result in more complex story structures ("I think you said they imagined I did X").

It could be extended to allow for self-modifying narrative logics, where the underlying logic itself is subject to some form of Darwinian or Popperian learning. We hypothesize that the evolving logic, where constrained to physically plausible environments, would result in strange but human-comprehensible stories, whereas a logic evolved to contend with an "alien" environment would result in less comprehensible stories.

Figure 6.6. Nested models of self and others' models, from Ref. [24, Fig. 22.1]. Robot images © Julianne D. Halley; used with permission.

6.5. Discussion and Conclusions

6.5.1. Communication and cognition

We need to distinguish two different approaches to the idea of robot narrative. First, it is a communicative faculty, used by robots who operate with internal world models, and conduct simulations, etc., but who translate those ways of negotiating their environment into narrative form for the purpose of communication with each other and with humans. Second, the robots may also use narrative as a cognitive resource, so that narrative sensemaking is directly part of their engagement with their environment, and the formal basis of their cognition and communication is therefore the same. This second approach is more difficult to implement, but the first allows narrative only a limited role.

6.5.2. Social robots

Narrative communication between robots is significant to the extent that they have different experiences, and different cognitive perspectives. The story of the Greek and Roman robot teams illustrates the difference between the social sharing of narrative experience among separate robot minds, and the pooling of information among distributed instances of one collective hive mind. Meaningful communication in general requires both connection and difference; the circulation of stories does not just build a cumulative repository of knowledge, but proliferates interpretations of stories, in the different contexts of the experience of individual robots. The reciprocity between the range of stories and the range of interpretations provides for the possibility of a progressive refinement of the narrative competence of individual robots, and so a rise in the overall narrative competence of the population.


6.5.3. Narrative logic and its interface with world modeling in artificial intelligence

Narrative depends on an implicit connective logic, and inferences from it. Because this connective logic concerns change (process, action), it cannot be grounded in the contents of a classical form of world model alone; nor can narrative knowledge be translated into the terms of such a world model without fundamental loss. Equally, because the horizons of the implicit recede continually before any process of cognitive inference that derives explicit assumptions from implicit relations (on the basis of precedent or of principle), narrative logic does not resolve into any final, grounded form in its own terms. It has to remain a provisional resource informing agency within an environment, to be drawn upon within the pragmatic limits of the situated negotiation between robot subject and world (and reciprocally modified by the experience of that negotiation). The distinctive force of narrative communication, according to this line of reasoning, requires it to be assimilated to a narrative mode of cognition that is embodied, situated, and enactive.25

6.5.4. Beyond a "repository of actions": The particular and the general in narrative

Narrative sensemaking concerns the form of particular experiences, and makes sense of the particular by assimilating it to the familiar shapes of the general: to narrative forms, templates and scripts that encapsulate "how the world goes" at different degrees of abstraction. These general narrative forms themselves, however, need to be extrapolated from the particular to the extent that they are not preprogrammed. Such extrapolation is a form of pattern recognition, and is equally involved in narrative sensemaking in response to experience and in interpretation of a communicated story. Without such a capability, a robot's repository of actions remains a database of particulars of no relevance to any circumstances except the recurrence of specific situations. The reciprocal dependence between particular acts of narrative sensemaking and general narrative competence tends to compound narrative's vulnerability to fallacy, but such a feedback loop is fundamental to narrative's cognitive value.


6.5.5. Story generator and story parser

The most basic challenge confronted by this project is the design of a story generator and a story parser (the two would substantially mirror each other). This could be pursued in the first instance at the level of narrative communication between robots, in which case it is essentially a problem of translation into and out of the parameters of the robots' world models, and is detached from the question of narrative's efficacy as a way of negotiating experience in itself. The distinctive value of narrative in this case will be a matter of its utility in the social circulation and consolidation of knowledge. Many of the difficulties to be addressed at the communicative level are essentially the same as those that arise at the cognitive level, with respect to the circuit of narrative sensemaking (making sense of narratives in communication, and using narrative to make sense in cognition). One fundamental difference, however, is that generating and parsing stories in the service of a world model is quite different from generating and parsing stories in the service of enactive experience in an environment.

6.5.6. Preparing for the future

Robots now and in the near future are escaping the laboratory and entering into our lives as autonomous entities for social play and pet companions, as health and social care support workers, and more.4 As these robots encroach on our day-to-day lives, a key advantage of their learning and expressing themselves in a narrative form is that we should be better able to understand them, and to be understood in turn. There is also the opportunity to share knowledge among robots for group-learning effects. Such robot narratives are not limited to embodied robots but extend also to simulated robots: AI. The concept discussed here, where progress can be made more quickly while retaining some of the benefits of embodiment, extends to AI more generally and to the wider set of applications to which AI relates. It would provide us with a useful lens through which to understand, manage and unpack AI decision-making processes, since we can interpret the internal models via narrative.


Acknowledgments

MSO acknowledges funding from the 2017 YCCSA (York Cross-disciplinary Centre for Systems Analysis) summer school. JB, SS, AW, RW acknowledge the support of the Narrating Complexity workshop series, funded by the University of York and the University of Abertay.

References

1. R. Walsh and S. Stepney. Introduction and overview: Who, what, why. In R. Walsh and S. Stepney (eds.), Narrating Complexity (Springer, 2018), pp. 3–9.
2. H. P. Abbott. Unnarratable knowledge: The difficulty of understanding evolution by natural selection. In D. Herman (ed.), Narrative Theory and the Cognitive Sciences (CSLI, 2003), pp. 143–162.
3. R. Walsh and S. Stepney (eds.), Narrating Complexity (Springer, 2018).
4. S. Turkle, Alone Together (Basic Books, 2011).
5. B. Boyd, The evolution of stories: From mimesis to language, from fact to fiction, WIREs Cognitive Sci. 9(1) (2018).
6. C. Blum, A. F. T. Winfield, and V. V. Hafner, Simulation-based internal models for safer robots. Front. Robot. AI 4, 74 (2018).
7. A. F. T. Winfield. Robots with internal models: A route to self-aware and hence safer robots. In J. Pitt (ed.), The Computer After Me: Awareness and Self-Awareness in Autonomic Systems (Imperial College Press, 2014), pp. 237–252.
8. A. Winfield. When robots tell each other stories: The emergence of artificial fiction. In Ref. 3, pp. 39–47.
9. D. C. Dennett, Darwin's Dangerous Idea (Allen Lane, 1995).
10. S. Doncieux, N. Bredeche, J.-B. Mouret, and A. E. G. Eiben, Evolutionary robotics: What, why, and where to, Front. Robot. AI 2, 4 (2015).
11. J. Kober and J. Peters. Reinforcement learning in robotics: A survey. In Learning Motor Skills (Springer, 2014), pp. 9–67.
12. J. Bongard, V. Zykov, and H. Lipson, Resilient machines through continuous self-modeling, Science 314(5802), 1118–1121 (2006).
13. H. G. Marques and O. Holland, Architectures for functional imagination. Neurocomputing 72(4), 743–759 (2009).
14. P. J. O'Dowd, M. Studley, and A. F. T. Winfield, The distributed co-evolution of an on-board simulator and controller for swarm robot behaviours. Evolution. Intell. 7(2), 95–106 (2014).
15. R. Vaughan and M. Zuluaga, Use your illusion: Sensorimotor self-simulation allows complex agents to plan with incomplete self-knowledge. In From Animals to Animats 9 (Springer, 2006), pp. 298–309.


16. A. F. T. Winfield, Experiments in artificial theory of mind: From safety to story-telling, Front. Robot. AI 5, 75 (2018).
17. J. C. Zagal, J. Ruiz-del-Solar, and P. Vallejos, Back to reality: Crossing the reality gap in evolutionary robotics, IFAC Proceedings Volumes 37(8), 834–839 (2004).
18. D. Vanderelst and A. Winfield, An architecture for ethical robots inspired by the simulation theory of cognition. Cognitive Systems Research 48, 56–66 (2018).
19. A. F. T. Winfield, C. Blum, and W. Liu, Towards an ethical robot: Internal models, consequences and ethical action selection. In TAROS 2014: Advances in Autonomous Robotics Systems, number 8717 in LNCS (Springer, 2014), pp. 85–96.
20. D. Herman, Story Logic: Problems and Possibilities of Narrative (University of Nebraska Press, 2002).
21. S. Stepney, Embodiment. In D. Flower and J. Timmis (eds.), In Silico Immunology, Chapter 12 (Springer, 2007), pp. 265–288.
22. T. Hoverd and S. Stepney, Environment orientation: A structured simulation approach for agent-based complex systems, Nat. Comput. 14(1), 83–97 (2015).
23. A. F. T. Winfield and M. D. Erbas, On embodied memetic evolution and the emergence of behavioural traditions in robots, Memetic Computing 3(4), 261–270 (2011).
24. S. Stepney and R. Walsh, From simplex to complex narrative? In R. Walsh and S. Stepney (eds.), Narrating Complexity (Springer, 2018), pp. 319–322.
25. A. Clark, Being There: Putting Brain, Body and World Together Again (Oxford University Press, 1997).


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0007

Chapter 7

Evolving Boolean Regulatory Networks with Variable Gene Expression Times

Larry Bull

Department of Computer Science & Creative Technologies, University of the West of England, Bristol BS16 1QY, UK
[email protected]

The time taken for gene expression varies, not least because proteins vary considerably in length. This chapter uses an abstract, tuneable Boolean regulatory network model to explore gene expression time variation. In particular, it is shown how non-uniform expression times can emerge under certain conditions through simulated evolution. That is, gene expression time variance can be beneficial in shaping the dynamical behavior of the regulatory network, without explicit consideration of protein function.

7.1. Introduction

A protein's function is dependent upon its tertiary (3D) structure, which in turn is dependent upon the primary structure of the amino acid sequence by which it is specified. Typically, the more amino acids in the primary structure, the more complex the tertiary structure. Similarly, the more amino acids, the longer gene expression can be expected to take. The lengths of genes/amino acid sequences vary considerably within and across taxa. It is well established that, due to chemical equivalences between amino acid sequences, there is a strong neutrality effect at the molecular level of evolution (e.g., Ref. [1]). However, this does not fully explain the differences in the distribution of protein lengths seen, nor why eukaryotic proteins are typically larger than bacterial proteins (e.g., Ref. [2]).


With the aim of enabling the systematic exploration of artificial genetic regulatory network (GRN) models, a simple approach to combining them with abstract fitness landscapes has been presented.3 More specifically, random Boolean networks (RBN)4 were combined with the NK model of fitness landscapes.5 In the combined form — termed the RBNK model — a simple relationship between the states of N randomly assigned nodes within an RBN is assumed such that their value is used within a given NK fitness landscape of trait dependencies. This chapter explores the introduction of variable expression times to the genes in a traditional Boolean regulatory network within the RBNK model. That is, the effects of protein length variation are considered based purely upon the dynamical behavior of the regulatory network being shaped under an evolutionary process. It is shown that non-uniform gene expression times can be selected for and that a relationship appears to exist between gene length and gene connectivity in such cases.

7.2. The RBNK Model

Within the traditional form of RBN, a network of R nodes, each with a randomly assigned Boolean update function and B directed connections randomly assigned from other nodes in the network, all update synchronously based upon the current state of those B nodes (Figure 7.1). Hence those B nodes are seen to have a regulatory effect upon the given node, specified by the Boolean function attributed to it. Since they have a finite number of possible states and are deterministic, such networks eventually fall into an attractor. It is well established that the value of B affects the emergent behavior of RBN, wherein attractors typically contain an increasing number of states with increasing B (see Ref. [6] for an overview). Three regimes of behavior exist: ordered when B = 1, with attractors consisting of one or a few states; chaotic when B ≥ 3, with a very large number of states per attractor; and a critical regime around B = 2, where similar states lie on trajectories that tend to neither diverge nor converge (see Ref. [7] for formal analysis). Note that traditionally the size of an RBN is labeled N, as opposed to R here, and the degree of node connectivity is labeled K, as opposed to B here. The change is adopted due to the traditional use of the labels N and K in the NK model of fitness landscapes, which is also used in this chapter, as will be shown.

Kauffman and Levin5 introduced the NK model to allow the systematic study of various aspects of fitness landscapes (see Ref. [6] for an overview). In the standard NK model an individual is represented by a set of N (binary) genes or traits, each of which depends upon its own value and that of K randomly chosen others in the individual (Figure 7.1). Thus increasing K, with respect to N, increases the epistasis. This increases the ruggedness of the fitness landscapes by increasing the number of fitness peaks. The NK model assumes all epistatic interactions are so complex that it is only appropriate to assign (uniform) random values to their effects on fitness. Therefore, for each of the possible K interactions, a table of 2^(K+1) fitnesses is created, with all entries in the range 0.0–1.0, such that there is one fitness value for each combination of traits. The fitness contribution of each trait is found from its individual table. These fitnesses are then summed and normalized by N to give the selective fitness of the individual. Exhaustive search of NK landscapes8 suggests three general classes exist: unimodal when K = 0; uncorrelated, multi-peaked when K > 3; and a critical regime around 0 < K < 4, where multiple peaks are correlated.

Figure 7.1. Example traditional RBN (a) and NK (b) models. Both contain three genes mutually connected, with the state-transition/fitness-contribution table shown for one gene in each case.

As shown in Figure 7.2, in the RBNK model N nodes (where 0 < N ≤ R) in the RBN are chosen as outputs, that is, their state determines fitness using the NK model. The combination of the RBN and NK models enables a systematic exploration of the relationship between phenotypic traits and the genetic regulatory network by which they are produced. It was previously shown how achievable fitness decreases with increasing B, how increasing N with respect to R decreases achievable fitness, and how R can be decreased without detriment to achievable fitness for low B.3 In this chapter N phenotypic traits are attributed to randomly chosen nodes within the network of R genetic loci (Figure 7.2). Hence the NK element creates a tuneable component to the overall fitness landscape. Self-connection by nodes is allowed.

Figure 7.2. Example RBNK model. Dashed lines and nodes indicate where the NK fitness landscape is embedded into the RBN model.
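To make the preceding definitions concrete, the following minimal Python sketch (our illustration, not code from the chapter; the small parameter values are arbitrary, whereas the chapter uses R = 100 and N = 10) builds one random RBNK individual and evaluates it: the RBN is iterated synchronously, and the final states of the N output nodes are scored against NK-style random fitness tables.

import random

random.seed(0)
R, B, N, K = 16, 2, 4, 2   # illustrative sizes only

# RBN: each node has B regulators and a Boolean function stored as a
# truth table over the 2^B possible regulator-state combinations.
conns = [[random.randrange(R) for _ in range(B)] for _ in range(R)]
funcs = [[random.randrange(2) for _ in range(2 ** B)] for _ in range(R)]

def rbn_step(state):
    """One synchronous update: every node reads its B regulators at once."""
    new = []
    for i in range(R):
        idx = 0
        for j in conns[i]:
            idx = (idx << 1) | state[j]
        new.append(funcs[i][idx])
    return new

# NK part: N trait nodes are chosen from the RBN; each trait's fitness
# contribution depends on its own value and K other traits, via a random
# table of 2^(K+1) entries drawn uniformly from [0.0, 1.0].
traits = random.sample(range(R), N)
epistasis = [random.sample([j for j in range(N) if j != t], K) for t in range(N)]
tables = [[random.random() for _ in range(2 ** (K + 1))] for _ in range(N)]

def nk_fitness(bits):
    total = 0.0
    for t in range(N):
        idx = bits[t]
        for e in epistasis[t]:
            idx = (idx << 1) | bits[e]
        total += tables[t][idx]
    return total / N            # summed and normalized by N

state = [random.randrange(2) for _ in range(R)]
for _ in range(100):            # 100 update cycles, as used later in the chapter
    state = rbn_step(state)
print(nk_fitness([state[i] for i in traits]))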

7.3. The RBNK Model with Variable Gene Expression Times

7.3.1. Gene expression

Within the traditional form of RBN, each node updates synchronously, in parallel, taking one time-step; that is, each gene has the same length or expression time: one update cycle. To include a mechanism which enables variation in the time taken for expression, each node in the RBN is extended to (potentially) include a time, from a specified range, by which to delay updating its state once turned on. Hence on each cycle, each single time-step node updates its state based upon the current state of the B nodes it is connected to, using the Boolean logic function assigned to it in the standard way. Any nodes marked as having a longer expression time are checked to see if they have previously been switched on and have waited the number of update cycles associated with them. If this is the case, the node is turned on and the counter reset; otherwise its counter is incremented and it remains off. Such nodes remain on until they are switched off by their Boolean function; returning to the off state and a subsequent delay for continued expression is not explored here.
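The delay mechanism might be realized as in the sketch below (our reading of the scheme, not the chapter's code; in particular, resetting a node's wait counter when its Boolean function stops calling for expression mid-wait is our assumption).

def step_with_delays(state, counters, conns, funcs, delay):
    """One synchronous update where delay[i] > 0 marks node i as a delayed
    node that must wait delay[i] cycles after being switched on; delay[i] == 0
    is the ordinary single time-step case."""
    new_state, new_counters = [], []
    for i in range(len(state)):
        idx = 0
        for j in conns[i]:
            idx = (idx << 1) | state[j]
        want_on = funcs[i][idx]
        if delay[i] == 0 or state[i] == 1:
            # ordinary node, or an already-on delayed node (which stays on
            # until its Boolean function switches it off)
            new_state.append(want_on)
            new_counters.append(0)
        elif want_on and counters[i] + 1 >= delay[i]:
            new_state.append(1)          # waited long enough: turn on, reset counter
            new_counters.append(0)
        elif want_on:
            new_state.append(0)          # still waiting: stay off, count the cycle
            new_counters.append(counters[i] + 1)
        else:
            new_state.append(0)          # assumption: an interrupted switch-on resets
            new_counters.append(0)
    return new_state, new_counters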

7.3.2. Experimentation

For simplicity with respect to the underlying evolutionary search process, a genetic hill-climber is considered here, as in Ref. [3]. Each RBN is represented as a list defining each node's Boolean function, B connection ids, an update time delay, and whether or not it is a time-delayed node. Mutation can therefore either (with equal probability): alter the Boolean function of a randomly chosen node; alter a randomly chosen B connection; turn a node into or out of being a time-delayed node; or alter a time delay, if it is a delayed node. A single fitness evaluation of a given GRN is ascertained by updating each node for 100 cycles from a randomly defined genome start state. On the last update cycle, the value of each of the N trait nodes in the GRN is used to calculate fitness on the given NK landscape. This process is repeated ten times per run. A mutated GRN becomes the parent for the next generation if its fitness is higher than that of the original. In the case of fitness ties, the number of time-delayed nodes is considered, with the smaller number favoured, the decision being arbitrary upon a further tie. Hence there is a slight selective pressure against variable expression times. Here R = 100, N = 10, and results are averaged over 100 runs — 10 runs (each of 10 random starts) on each of 10 landscapes per parameter configuration — for 5000 generations. As in Ref. [3], 0 < B ≤ 5 and 0 ≤ K ≤ 5 are used. Expression times (T) were able to vary up to 10 update cycles (1 ≤ T ≤ 10), where T = 0 is the traditional case of no delay.

Figure 7.3(a) shows how there is a significant (T-test, p < 0.05) drop in fitness for B > 2 compared to B < 3, regardless of K. Moreover, it can be seen that around 3% of nodes, on average, have an expression time of longer than one update cycle for B > 1, for all K. Figure 7.3(b) also shows the average expression time for those nodes. As can be seen, for B = 2, T ≈ 2, whereas there appears to be no selective pressure on T in the cases where evolution struggles to produce high fitness networks, that is, when B > 2, with average times of around five cycles (in a range 1–10) typically seen.
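A minimal sketch of one reading of this representation and mutation scheme (the genome encoding and field names below are our own, not the chapter's):

import copy
import random

def mutate(genome, R, B, T_max=10):
    """Genome: a list of R node records, each with a truth table 'func'
    (2^B entries), B input ids 'conn', a 'delayed' flag and a 'delay' value.
    One of the four operator types is applied with equal probability."""
    child = copy.deepcopy(genome)
    node = child[random.randrange(R)]
    op = random.randrange(4)
    if op == 0:                                   # alter the Boolean function
        k = random.randrange(len(node['func']))
        node['func'][k] ^= 1
    elif op == 1:                                 # alter one of the B connections
        node['conn'][random.randrange(B)] = random.randrange(R)
    elif op == 2:                                 # toggle time-delayed status
        node['delayed'] = not node['delayed']
    elif node['delayed']:                         # alter delay, only for delayed nodes
        node['delay'] = random.randint(1, T_max)
    return child

The hill-climber would then keep the mutant only if it is fitter, breaking fitness ties in favour of the genome with fewer delayed nodes, as described above.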

Figure 7.3. Evolutionary behavior after 5000 generations in the RBNK model for R = 100, N = 10 and various B and K combinations. (a) shows fitnesses and the percentage of nodes with delayed expression; (b) shows the corresponding average delay. Error bars show min and max values.

7.3.3. Asynchronous experimentation

As noted above, traditional RBN update synchronously, that is, a global clock signal is assumed to exist. It has long been suggested that this assumption is less than realistic for natural systems, and hence discrete dynamical models have also used asynchronous updating (e.g., Ref. [9]). Harvey and Bossomaier10 presented an asynchronous form of RBN wherein a node is picked at random (with replacement) to be updated, with the process repeated R times per cycle to give equivalence to the synchronous case. The resulting loss of determinism means such networks no longer fall into regular cyclic attractors; rather, they typically fall into so-called "loose" attractors, where "the network passes indefinitely through a subset of its possible states".10 Many forms of asynchronous updating are possible (e.g., see Ref. [11] for an overview), but the simple random scheme is used here. It can also be noted that Gershenson12 has introduced a deterministic asynchronous update scheme wherein individual nodes are given an update frequency from a range, similar to above. Using very small RBN, it is suggested that such updating typically increases the length and number of attractors for low network connectivity B.11

Figure 7.4(a) shows how there is a significant (T-test, p < 0.05) drop in fitness for B > 2 compared to B < 3 regardless of K, as in the synchronous case above (see also Ref. [3]). Again, it can be seen that around 3% of nodes, on average, have an expression time of longer than one update cycle for all K. In contrast to the synchronous case above, this is also true for B = 1. As noted in Section 7.2, such RBN typically exhibit point attractors, whereas this is not the case for asynchronous updating. Analysis in all cases shows that the delayed nodes form part of the subset changing state within the attractor, which corresponds with the B = 1 result for synchronous updating, as there they are not used. Figure 7.4(b) shows the average expression time for the delayed nodes. As can be seen, there again appears to be no selective pressure on T in the cases where evolution struggles to produce high fitness networks, that is, when B > 2. For B = 1, T ≈ 1, which is significantly (T-test, p < 0.05) less than for B = 2, where T ≈ 2. The latter value for T is the same as for synchronous updating above. Similar general findings were found for other ranges of T (not shown).

Figure 7.4. Evolutionary behavior after 5000 generations in the asynchronous RBNK model for R = 100, N = 10 and various B and K combinations. (a) shows fitnesses and the percentage of nodes with delayed expression; (b) shows the corresponding average delay. Error bars show min and max values.

7.4. Variable Sized GRN with Variable Gene Expression Times

7.4.1. Emergent complexity

In the above experimentation, the total number of nodes R within the GRN was fixed. Using a version of the NK model, Harvey13 showed, by including a bias, that gradual growth through small increases in genome length via mutation is sustainable, whereas large increases in genome length per growth event are not sustainable. This is explained as being due to the fact that a degree of correlation between the smaller fitness landscape and the larger one must be maintained; a fit solution in the former space must achieve a suitable level of fitness in the latter to survive into succeeding generations. Kauffman and Levin5 discussed this general concept with respect to fixed-size NK landscapes and varying mutation step sizes therein. They showed how, for long jump adaptations, that is, mutation steps of a size which go beyond the correlation length of a given fitness landscape, the time taken to find fitter variants doubles per generation. Harvey's13 growth operator is a form of mutation which adds g random genes to an original genome of length G. Hence he draws a direct analogy between the size of g and the length of a jump in a traditional landscape; the larger g, the less correlated the two landscapes will be, regardless of the underlying degree of correlation of each. Aldana et al.14 have examined the effects of adding a new, single gene into a given RBN through duplication and divergence. They find, somewhat reminiscent of Harvey's result, that the addition of one gene typically only slightly alters the attractors of the resulting RBN when B < 3. Attractor structure is not conserved in the chaotic regime, however.


The experiments reported above have been repeated with the addition of two extra "macro" mutation operators: one to delete the end node (the N trait nodes cannot be deleted), randomly reassigning any connections to it; and one to add a random node onto the end of the genome, connecting it to a randomly chosen node in the network. These two operators occur with equal probability to the previously described mutation operators. The replacement process is also altered such that, when fitnesses and the number of delayed nodes are equal, the smaller network is kept, with ties again broken at random. RBNs were initialized with R = 100, as before.

7.4.2. Experimentation

In all cases, no significant change in the fitness of solutions is seen (not shown). There is also, typically, no significant effect on the resulting size of the networks. However, as can be seen in Figure 7.5, for low connectivity (B < 3), regardless of K and the updating scheme, the networks decrease in size by around a half — a statistically significant change (T-test, p < 0.05). That is, not only do low connectivity networks evolve the highest fitnesses for all K, they are able to do so with a smaller number of nodes R. It is known that both the number of states in an attractor and the number of attractors are dependent upon R within RBN, and that the general form of those relationships changes for low and high connectivity. For example, with synchronous updating, when B = 2, attractors are typically of size R^0.5, whereas, when B = R, attractors typically contain 0.5 × 2^(R/2) states (e.g., see Ref. [6] for a summary). Hence the evolutionary process appears able to exploit the potential for ever smaller attractors for the low B cases, driven by the selection pressure for network size reduction, and to do so whilst maintaining fitness. In this case, incrementally decreasing R is sustainable since the subsequent change in the attractors is sufficiently small for low B, whereas the same change in R appears to cause a significant change in the attractor space for high B. This result is somewhat anticipated by those of Aldana et al.14 and Harvey13 but is also in the opposite direction: small reductional changes are maintained.
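The two macro operators described at the start of this section might be sketched as follows (again our illustration, reusing the genome encoding of the earlier sketch; how the new node's B inputs are wired is our assumption).

import copy
import random

def shrink(genome, N):
    """Delete the end node; the N trait nodes (assumed here to occupy the
    first N positions) are never deleted. Connections into the removed node
    are randomly reassigned."""
    if len(genome) <= N:
        return genome
    child = copy.deepcopy(genome[:-1])
    gone = len(child)                 # index of the node just removed
    for node in child:
        node['conn'] = [random.randrange(len(child)) if c == gone else c
                        for c in node['conn']]
    return child

def grow(genome, B):
    """Append one random node, wired to randomly chosen nodes in the network
    (self-connection by the new node is allowed)."""
    child = copy.deepcopy(genome)
    new_id = len(child)               # index of the node being added
    child.append({'func': [random.randrange(2) for _ in range(2 ** B)],
                  'conn': [random.randrange(new_id + 1) for _ in range(B)],
                  'delayed': False, 'delay': 1})
    return child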


Figure 7.5. Example behavior after 5000 generations when the RBN size is able to vary. (a) shows results for synchronous updating, with the fraction of the original length and the percentage of nodes with delayed expression (top), and the average expression time (bottom). (b) shows the same for asynchronous updating.

This general result was also found in Ref. [3] but is slightly altered by the inclusion of an expression delay since the networks for B = 1 are larger here for both update schemes (not shown). That is, as the results in Section 7.3 suggest, varying the gene expression time provides evolution with a mechanism through which to alter the attractor space. Varying the size of the network is another mechanism through which this can be achieved. When both mechanisms are made available to evolution, both appear to be used; there is a trend towards fewer delayed nodes per network when size is also varying. It can also be noted that a drop in the average delay for B = 2 with asynchronous updating is seen when size is varying.


7.5. Conclusions

The length of genes, and hence the length of the amino acid sequences which specify proteins, varies considerably within and across taxa. This chapter has considered the effects of protein length variation based purely upon the dynamical behavior of regulatory networks being shaped under evolution. It has been shown that non-uniform gene expression times can be selected for under low gene connectivity, and that such genes typically have an expression delay proportional to the degree of connectivity. This general result appears to correspond with the natural case: it is known that eukaryotic organisms typically exhibit longer proteins than prokaryotes (e.g., Ref. [15]); and it is also known that organisms such as E. coli have a lower average level of gene connectivity than higher organisms such as S. cerevisiae (e.g., Ref. [15]). As such, the use of non-uniform gene expression times may prove beneficial in work considering the evolution of artificial genetic regulatory networks (e.g., see Ref. [3] for an overview) for computation.

References

1. M. Kimura, Evolutionary rate at the molecular level. Nature 217, 624–626 (1968).
2. A. Tiessen, P. Perez-Rodriguez, and L. Delaye-Arredondo, Mathematical modeling and comparison of protein size distribution in different plant, animal, fungal and microbial species reveals a negative correlation between protein size and protein number, thus providing insight into the evolution of proteomes. BMC Res. Notes 5, 85 (2012).
3. L. Bull, Evolving Boolean networks on tuneable fitness landscapes. IEEE Trans. Evolution. Comput. 16(6), 817–828 (2012).
4. S. A. Kauffman, Metabolic stability and epigenesis in randomly constructed genetic nets. J. Theoret. Biol. 22, 437–467 (1969).
5. S. A. Kauffman and S. Levin, Towards a general theory of adaptive walks on rugged landscapes. J. Theoret. Biol. 128, 11–45 (1987).
6. S. A. Kauffman, The Origins of Order (Oxford University Press, 1993).
7. B. Derrida and Y. Pomeau, Random networks of automata: A simple annealed approximation. Europhys. Lett. 1, 45–49 (1986).
8. J. Smith and R. E. Smith, An examination of tuneable random search landscapes. In W. Banzhaf and C. Reeves (eds.), Foundations of Genetic Algorithms V (Morgan Kaufmann, 1999), pp. 165–182.


9. K. Nakamura, Asynchronous cellular automata and their computational ability. Systems, Computers, Controls 5(5), 58–66 (1974).
10. I. Harvey and T. Bossomaier, Time out of joint: Attractors in asynchronous random Boolean networks. In P. Husbands and I. Harvey (eds.), Proceedings of the Fourth European Conference on Artificial Life (MIT Press, 1997), pp. 67–75.
11. C. Gershenson, Updating schemes in random Boolean networks: Do they really matter? In J. Pollack et al. (eds.), Artificial Life IX (MIT Press, 2004), pp. 238–243.
12. C. Gershenson, Classification of random Boolean networks. In R. Standish et al. (eds.), Artificial Life VIII (MIT Press, 2002), pp. 1–8.
13. I. Harvey, Species adaptation genetic algorithms: A basis for a continuing SAGA. In F. J. Varela and P. Bourgine (eds.), Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life (MIT Press, 1992), pp. 346–354.
14. M. Aldana, E. Balleza, S. Kauffman, and O. Resendiz, Robustness and evolvability in genetic regulatory networks. J. Theoret. Biol. 245, 433–448 (2007).
15. R. Leclerc, Survival of the sparsest. Molecular Syst. Biol. 4, 213–216 (2008).



© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0008

Chapter 8

Membrane Computing Concepts, Theoretical Developments and Applications

Erzsébet Csuhaj-Varjú∗, Marian Gheorghe†, Alberto Leporati‡, Miguel Ángel Martínez-del-Amor§, Linqiang Pan¶, Prithwineel Paul‖, Andrei Păun∗∗, Ignacio Pérez-Hurtado§, Mario J. Pérez-Jiménez§, Bosheng Song††, Luis Valencia-Cabrera§, Sergey Verlan‡‡, Tingfang Wu§§, Claudio Zandron‡ and Gexiang Zhang¶¶

∗Eötvös Loránd University, Budapest, Hungary
†Bradford University, Bradford, UK
‡Università degli Studi di Milano-Bicocca, Milano, Italy
§Universidad de Sevilla, Sevilla, Spain
¶Huazhong University of Science and Technology, Wuhan, China
‖Southwest Jiaotong University, Chengdu, China
∗∗University of Bucharest, Bucharest, Romania
††Hunan University, Changsha, China
‡‡Université Paris Est Créteil, Créteil, France
§§Soochow University, Suzhou, China
¶¶Chengdu University of Technology, Chengdu, China
[email protected]

This chapter discusses the key concepts in Membrane Computing, with some examples illustrating their usage, the main theoretical developments, by pointing at some of the most significant results, and a set of applications in various areas, as well as adequate tools utilized in the simulation, analysis and verification of membrane systems.

8.1. Introduction

Membrane computing is a branch of natural computing inspired by the functions and structure of the living cell, with its compartments separated by membranes, each of them hosting internal bio-chemical reactions, as well as inter-membrane interactions. The model was initiated by Gh. Păun in the seminal paper Ref. [1]. Since then the field has grown very fast, following various research themes, including both fundamental theoretical aspects and applications in various areas. A handbook including the main research topics, at the level of 2010, was produced.2 In the last five years, a few more research monographs have been published, on applications of membrane computing in designing robot controllers,3 theory and applications of membrane computing,4 and real-life applications of this model.5

The Journal of Membrane Computing (JMC) was launched at the beginning of 2019, and the first four issues of the first volume have been published. The foreword presents a brief history of the research developments of membrane computing in the last two decades and some of the key achievements.6 These four JMC issues cover some of the most important research themes of this field.

The theoretical thread, consisting of research on the computational power of various variants of membrane systems, is represented by investigations on the use of rules applied through a synchronization relationship7; membrane duplication as biopolymer duplication8; metabolic P systems9; P colonies10; spiking neural P systems with generalized sets of rules,11 generating context-free languages,12 or using matrix representation13; and polymorphic P systems applied for the inference of bounded L systems.14 Another theoretical aspect is that of membrane systems complexity, which is illustrated by papers on variants of polarizationless P systems with active membranes15; P systems with proteins16; shallow non-confluent P systems characterizing PSPACE17; minimal cooperation in cell-like P systems18; P systems with compound objects providing solutions to the graph coloring problem19; and P systems attacking hard problems beyond NP.20

Applications of membrane systems are also well represented, by papers regarding membrane computing utilized in designing robot controllers21; modeling market interactions using PDP systems22; modeling swarming and aggregation in a Myxobacterial colony using P systems23; membrane computing used in image processing24; search-based testing for P systems25; the P-Lingua simulation tool26, 27; and P systems implementation using multi-vesicular liposomes.28 There are also papers studying connections of membrane computing with other computational paradigms, such as reaction systems29; networks of bio-inspired processors30; and hyperparameter optimization.31 Just to emphasize the quality of the papers published so far, two of the above mentioned publications have been awarded, by the International Membrane Computing Society Board, the best theoretical paper7 and the best application paper28 for 2019.

In Section 8.2, some of the key concepts related to the main types of membrane systems are introduced, and several examples illustrate their usage. Section 8.3 presents the main theoretical achievements with respect to the computational power and complexity theory of membrane computing. This nature-inspired computing paradigm has been investigated not only in connection with the theory of computation, but has also been utilized in modeling various applications32 and, more recently, systems and synthetic biology problems.33 In these circumstances, it is surprising to find out that membrane systems "are more suitable for the theory of computation than for modeling in systems biology".34 Section 8.4 presents some representative examples of models in various areas, including ecosystems, path planning and control of autonomous mobile robots, fault diagnosis in power systems, and other engineering optimization problems. These are based on some of the most recent applications of membrane computing published after 2017, designing robot controllers3 and modeling real-life applications.5 We hope that this broad palette of models makes a convincing case for the use of membrane systems as a modeling tool. Section 8.5 describes a set of tools utilized in the simulation, analysis and verification of membrane systems. They represent effective support in any modeling approach based on membrane systems.


8.2. Types of P Systems

In the following subsections, we provide a brief description of some basic types of membrane systems, to give an insight into the most important models of membrane computing theory.

8.2.1. Preliminaries

The main component of a generic P system is a cell-like membrane structure which consists of regions (compartments) that are separated by membranes from each other and are organized in a tree structure. Each region contains a multiset of objects, that may be empty, and possibly other regions. The objects represent molecules (bio-chemical ingredients). Each region is associated with a set of rules (maybe empty), which are applied to the multisets of objects in the region. The rules can serve for evolution (for changing the multisets of objects) and/or communication, that is, they transport multisets of objects from one region to another one. The rules are usually applied in the non-deterministic maximally parallel way (maximally parallel way, for short), that is, in every computation step (in every configuration change) a maximal multiset of non-deterministically chosen rules is applied to the objects in the regions of the membrane system. We note that in P systems theory, networks of regions (compartments as nodes of a virtual graph) are also considered, modeling tissues and neural networks.

Throughout the chapter, we suppose the reader is familiar with the basics of formal languages and computability theory. We recall some notions and notations we will use subsequently. Let V be an alphabet (a finite nonempty set), V∗ be the set of all words over V, and let V+ = V∗ − {λ}, where λ denotes the empty word. The length of a word w ∈ V∗ is denoted by |w|, and |w|a denotes the number of occurrences of a symbol a ∈ V in w. N is the notation for the set of natural numbers.

Let O be a set of objects. A multiset is a pair M = (V, f), where V is an arbitrary (not necessarily finite) set of objects from O and f : O → N is a mapping which assigns to each object its multiplicity; if a ∉ V then f(a) = 0. The support of M = (V, f) is the set supp(M) = {a ∈ V | f(a) ≥ 1}; if supp(M) is a finite set, then M is called a finite multiset. A multiset M over the finite set of objects V can be represented as any string w over the alphabet V with |w|a = f(a), a ∈ V; λ represents the empty multiset. In the following, for any finite multiset of objects M = (V, f), we may use the notation w ∈ V∗, where w represents M.

8.2.2. Transition P systems

The transition P system is the generic variant of P system.1 Often, a simplified variant of this construct is used, called the symbol-object P system, where dissolution of membranes is not allowed and no priority relations on the rules are introduced. To provide the reader with a more complete picture, we recall the original definition.

Definition 1. A transition P system of degree n ≥ 1 is a construct

Π = (Γ, μ, w1, . . . , wn, (R1, ρ1), . . . , (Rn, ρn), iout),

where Γ is an alphabet; its elements are called objects; μ is a membrane structure of degree n, with the membranes and the regions labeled in a one-to-one manner; wi, 1 ≤ i ≤ n, are multisets over Γ, defining the initial multisets associated with the regions 1, 2, . . . , n of μ; Ri, 1 ≤ i ≤ n, are finite sets of evolution rules over Γ associated with the regions 1, 2, . . . , n of μ; ρi is a partial order relation over Ri, 1 ≤ i ≤ n, specifying a priority relation among the rules of Ri. An evolution rule is a pair (u, v), which we will usually write in the form u → v, where u is a multiset over Γ, and v = v′ or v = v′δ, where v′ is a multiset over (Γ × {here, out}) ∪ (Γ × {inj | 1 ≤ j ≤ n}), and δ is a special symbol not in Γ. The length of u is called the radius of the rule u → v. iout ∈ {1, 2, . . . , n} is the label of the output membrane.

We start to examine the rules in decreasing order of their priority and assign objects to them. A rule can be used only when there are copies of the objects whose evolution it describes which were not "consumed" by rules of a higher priority and, moreover, there is no rule of a higher priority, irrespective of which objects it involves, which is applicable at the same step.
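The string representation of multisets above maps directly onto counted collections; the small Python sketch below (our illustration; the names are ours, not the chapter's) also shows how an evolution rule u → v with target commands could be stored.

from collections import Counter

# The multiset represented by the string w = "aacb" over V = {a, b, c}:
# |w|_a = 2, |w|_b = 1, |w|_c = 1; the empty word λ is Counter().
M = Counter("aacb")

# An evolution rule u -> v, e.g. ac -> (a, here)(b, out)(b, in_2), stored as
# a pair: a multiset over Γ and a multiset over (Γ × {targets}).
rule = (Counter("ac"),
        Counter({("a", "here"): 1, ("b", "out"): 1, ("b", "in_2"): 1}))

def applicable(rule, region):
    """A rule can fire if the region contains its left-hand-side multiset."""
    u, _ = rule
    return all(region[a] >= n for a, n in u.items())

print(applicable(rule, M))   # True: M contains at least one a and one c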


When a rule u → v ∈ Ri is applied, all objects in u are consumed, and the multiset v of objects is produced, with the objects from v moving according to the communication commands here, out, inj associated with them: if we have (b, tar) in v, then b remains in the same region if tar = here, it exits the region through its delimiting membrane if tar = out, and it goes into the immediately lower membrane j if tar = inj (if there is no lower membrane j, then a rule which introduces a symbol (b, inj) is not allowed to be used). In general, the indication here is not explicitly written. If the symbol δ appears in v, then the membrane i is dissolved and at the same time the set of rules Ri (and its associated priority relation) is removed. The rules of a transition P system are used in a maximally parallel way.

A configuration of a transition P system at any moment is described by the current membrane structure (the rooted tree), together with all multisets of objects over Γ associated with the regions of this membrane structure at that moment. Starting from the initial configuration (consisting of the initial membrane structure and the initial multisets in the regions of the membrane structure) and applying rules in a maximally parallel way, one obtains a sequence of consecutive configurations. Each passage from a configuration to a next configuration is called a transition. A sequence of transitions starting from the initial configuration is a computation. A halting configuration is a configuration to which no rule of the system is applicable. Only halting computations give a result, encoded by the number of elements of the multiset of objects present in the output region iout. For a P system Π, N(Π) denotes the set of all the numbers computed by Π.

Transition P systems were first proposed in Ref. [1] and have since been widely investigated. In Ref. [1], an interesting variant of the transition P system, called the catalytic P system, was proposed (and studied in depth in Refs. [35–38]). In Ref. [39], evolution-communication P systems with energy and non-cooperative transition P systems without dissolution were considered. Cooperative transition P systems can also be simulated by evolution-communication P systems with energy.40

Figure 8.1. An example of a transition P system. (Membrane 1, the skin, contains membrane 2, which contains membrane 3; membrane 3 initially holds the multiset ac and has the rules a → ab, a → bδ, c → cc; membrane 2 has the rules (cc → c) > (c → δ) and b → e.)

We present an example to illustrate how a transition P system works. Let Π1 be a transition P system whose initial membrane structure, multisets of objects and rules in each membrane are presented in Figure 8.1, where i0 = 1. Because of the non-deterministic nature of transition P systems, there are many different paths of computation. We present only one of them. Initially, only membrane 3 has an initial multiset, consisting of objects a, c. If rules a → ab and c → cc are non-deterministically chosen in step 1, then the contents of membrane 3 will change to a, b, c, c. At step 2, if rules c → cc and a → bδ are applied, then the contents of this membrane will become b, b, c, c, c, c, but the membrane is dissolved and all its objects are released into membrane 2, and the rules of membrane 3 disappear. At step 3, in membrane 2, rules b → e and cc → c (because of the priority relation) are used in parallel and objects e, e, c, c are produced. At step 4, only rule cc → c can be applied, hence objects e, e, c are obtained. At step 5, only rule c → δ can be used, membrane 2 is dissolved and objects e, e are released into membrane 1; all the rules of membrane 2 disappear. As no rule is applicable, the result, in i0 = 1, is ee.
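Below is a minimal sketch (our illustration) of one non-deterministic maximally parallel step for a single region, ignoring priorities, dissolution and target commands: objects are greedily assigned to randomly chosen applicable rules until none can fire, and products are buffered so they cannot react within the same step. With membrane 3's rules a → ab and c → cc from Figure 8.1, the multiset ac yields abcc, matching step 1 of the computation above.

import random
from collections import Counter

def max_parallel_step(objs, rules):
    """One maximally parallel step in one region: keep firing randomly chosen
    applicable rules; right-hand sides are buffered until the step ends."""
    produced = Counter()
    while True:
        live = [(u, v) for u, v in rules
                if all(objs[a] >= n for a, n in u.items())]
        if not live:
            break
        u, v = random.choice(live)
        objs = objs - u              # consume the left-hand side
        produced = produced + v      # buffer the right-hand side
    return objs + produced

# Membrane 3 of Figure 8.1, with the dissolution rule a -> b delta omitted:
rules = [(Counter("a"), Counter("ab")), (Counter("c"), Counter("cc"))]
print(max_parallel_step(Counter("ac"), rules))   # Counter({'c': 2, 'a': 1, 'b': 1})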

8.2.3. P systems with active membranes

Definition 2. A P system with active membranes of degree m ≥ 1 is a tuple

Π = (O, H, μ, w1, . . . , wm, R, iout),

where O is an alphabet of objects; H is the alphabet of labels for membranes; μ is the initial membrane structure, of degree m, with all membranes labeled with elements of H and having neutral polarization, 0; w1, . . . , wm are multisets over O, called the initial multisets of objects present in the compartments of μ; R is a finite set of rules of the following forms (s is the label of the skin membrane, at the outer layer):

(a) [ a → v ]_h^e, for h ∈ H, e ∈ {+, −, 0}, a ∈ O, v ∈ O∗ (object evolution rules, associated with membranes and depending on the label h and the charge e of the membranes, but not directly involving the membranes, in the sense that the membranes are neither taking part in the application of these rules nor are they modified by them);
(b) a [ ]_h^e1 → [ b ]_h^e2, for h ∈ H, e1, e2 ∈ {+, −, 0}, a, b ∈ O (in communication rules; an object is introduced into the membrane, maybe modified during this process; also the polarization of the membrane can be modified, but not its label);
(c) [ a ]_h^e1 → [ ]_h^e2 b, for h ∈ H, e1, e2 ∈ {+, −, 0}, a, b ∈ O (out communication rules; an object is sent out of the membrane, maybe modified during this process; also the polarization of the membrane can be modified, but not its label);
(d) [ a ]_h^e → b, for h ∈ H − {s}, e ∈ {+, −, 0}, a, b ∈ O (dissolving rules; in reaction with an object, a membrane can be dissolved, while the object specified in the rule can be modified);
(e) [ a ]_h^e1 → [ b ]_h^e2 [ c ]_h^e3, for h ∈ H − {s}, e1, e2, e3 ∈ {+, −, 0}, a, b, c ∈ O (division rules for elementary membranes; in reaction with an object, the membrane is divided into two membranes with the same label, maybe of different polarizations; the object specified in the rule is replaced in the two new membranes by possibly new objects; all objects different from a are duplicated in the two new membranes);
(f) [ a ]_h^e1 → [ O1 ]_h^e2 [ O2 ]_h^e3, for h ∈ H − {s}, e1, e2, e3 ∈ {+, −, 0}, a ∈ O, O1 ∪ O2 = O and O1 ∩ O2 = ∅ (separation rules for elementary membranes; in reaction with an object, the membrane is separated into two membranes with the same label, maybe of different polarizations; the object a can evolve, the objects from O1 are placed in the first membrane, those from O2 are placed in the second membrane).

iout ∈ {0, 1, . . . , m} indicates the region where the result of a computation is obtained (0 represents the environment).

Rules of P systems with active membranes are applied in the non-deterministic maximally parallel manner, according to these constraints: any object can be subject of only one rule of any type, and any membrane can be subject of only one rule of types (b)–(f); rules of type (a) are not counted as applied to membranes, but only to objects (when a rule of type (a) is applied, the membrane can also evolve by means of a rule of another type); as usual, if a membrane is dissolved, then all the objects (and membranes, in the case that the division rule is applied to a non-elementary membrane) in its region are left free in the surrounding region; if a rule of type (e) or (f) is applied to a membrane, and its inner objects and membranes evolve at the same step, then it is assumed that first the inner objects and membranes evolve and then the division takes place, so that the result of applying rules inside the original membrane is replicated in the two new membranes (in short, the rules are applied in a bottom-up manner). Of course, the rules associated with a membrane h are used for all copies of this membrane, irrespective of whether the membrane is an initial one or is obtained by division. The skin membrane cannot be dissolved or divided, but it can be subject to in/out operations; because the environment is empty at the beginning, only objects which were expelled from the system can be present there, hence only such objects can be brought back into the system.

The configurations of a P system with active membranes Π identify the current membrane structure and the multisets of objects present in its compartments; (μ, w1, . . . , wm) is the initial configuration. Starting from the initial configuration and applying rules in the non-deterministic maximally parallel manner, one obtains a sequence of consecutive configurations. Similar to transition P systems, one can define transitions, halting configurations, computations and the result of a computation.

Since the model of P systems with active membranes was first proposed in Ref. [41], several variants have been considered. The idea of trading polarizations for labels was proposed in Ref. [42]. Motivated by the biological phenomenon of separation, the concept of membrane separation in relation to P systems with active membranes was considered in Ref. [43]. A slightly different variant of P systems with active membranes, whereby membrane creation is considered, was proposed in Ref. [44]. In Ref. [45], the notion of strong (non-elementary) division was proposed; in this case one can choose which membranes are copied into each of the resulting membranes. For further variants of P systems with active membranes, one can refer to Refs. [46–49].

Besides the use of maximal parallelism, some other strategies of using the rules have also been considered in the case of P systems with active membranes: minimal parallelism (each membrane which can evolve in a given step should do so by using at least one rule)50; time-freeness (the result generated by the given P system does not depend on the execution time of rules)51, 52; and asynchronous application (in each transition of a P system an arbitrary number of rules may be applied in any of its compartments).53

We give an example to illustrate how a P system with active membranes works. Let Π2 = ({a, b, c, d, e, f}, {1, 2}, [ [ ]_2^0 ]_1^0, {a}, ∅, R, 1) be a P system with active membranes, where R contains the following rules:

can define transitions, halting configuration, computation and the result of a computation. Since the model of P systems with active membranes was first proposed in Ref. [41], several variants of P systems with active membranes were considered. The idea of trading polarizations for labels was proposed in Ref. [42]. Motivated by the biological phenomenon of separation, the concept of membrane separation in relation to P systems with active membranes was considered in Ref. [43]. A slightly different variant of P system with active membranes, whereby membrane creation is considered, was proposed in Ref. [44]. In Ref. [45], the notion of strong (non-elementary) division was proposed. In this case one can choose which membranes are copied into each of the resulting membranes. For further variants of P systems with active membranes, one can refer to Refs. [46–49]. Besides the use of the maximal parallelism, some other strategies of using the rules were also considered in the case of P systems with active membranes: minimal parallelism (each membrane which can evolve in a given step should do it by using at least one rule)50 ; timefreeness (the result generated by the given P system does not depend on the execution time of rules)51, 52 ; asynchronous (in each transition of a P system an arbitrary number of rules may be applied in any of its compartments).53 We give an example to illustrate how a P system with active membranes works. Let Π2 = ({a, b, c, d, e, f }, {1, 2}, [ [ ] 02 ] 01 , {a}, ∅, R, 1) be a P system with active membranes, where R contains the following rules: + + − r1 : [ a → bc ] 01 ; r2 : b[ ] 02 → [ d ] + 2 ; r3 : [ d ] 2 → [ e ] 2 [ f ] 2 ; − r4 : [ e ] + 2 → c; r5 : [ f ] 2 → c.

The computation of the P system with active membranes is described as follows. Initially, only membrane 1 contains an initial object, a. Hence at step 1, rule r1 is used and b, c obtained in membrane 1. At step 2, rule r2 is applied, object b, outside of membrane 2 is rewritten into d and sent inside of the membrane,


8.2.4. Communication P systems

In the case of the previously discussed P systems, the objects are both transformed (rewritten) and moved across the membranes. We will now focus on purely communicating P systems, whereby objects are only transported between neighboring compartments (including the environment). P systems with rules allowing objects to travel together from one compartment to one of its neighbors, or simultaneously in opposite directions across a membrane, called symport/antiport P systems, are examples of such purely communicating P systems. They were introduced in Ref. [54].

Definition 3. A P system with symport and/or antiport rules (a symport/antiport P system, for short) is a construct

Π = (O, T, E, μ, (w1, R1), . . . , (wn, Rn), iout),

where n ≥ 1; O is the finite alphabet of objects; T ⊆ O is the alphabet of terminal objects; E ⊆ O is the set of objects present in the environment in a countably infinite number of copies; μ is a membrane structure of n membranes, where 1 indicates the skin membrane; wi ∈ O∗, 1 ≤ i ≤ n, is the initial contents (a finite multiset) of region i; Ri is a finite set of rules associated to membrane i, 1 ≤ i ≤ n, each of the form (u, out; v, in) with u ≠ λ, v ≠ λ (antiport rule), or (u, in), (u, out) with u ≠ λ (symport rules), where u, v ∈ O∗; and iout ∈ {1, . . . , n} is the label of an elementary membrane, called the output membrane.

An antiport rule (u, out; v, in) belonging to some rule set Ri, 1 ≤ i ≤ n, exchanges multiset u belonging to region i with multiset v from the outside of region i.


The symport rule (u, out) moves multiset u out of region i, to its parent region, and the symport rule (u, in) imports u from the parent region into i. If the rules are associated with the skin membrane, then the parent region is the environment. In the case of symport rules, if the parent region of i is the environment, then the imported multiset must contain at least one symbol not in E (otherwise an infinite number of objects could cross the skin membrane into the system). The multisets wi, 1 ≤ i ≤ n, and the membrane structure μ represent the initial configuration of Π. At the beginning, the environment does not contain any element of O \ E.

Often, symport/antiport rules are equipped with promoters and/or inhibitors, that is, finite multisets of objects acting as permitting and/or forbidding context conditions for the application of the rules. In this case the objects are allowed to be transported between the regions only if the corresponding regions have (respectively, do not have) the promoter (inhibitor) multiset included in their current multiset of objects.

Similar to other types of P systems, symport/antiport P systems compute by executing transitions among configurations. These configurations are tuples consisting of the multisets of objects belonging to the regions, together with the multiset of non-environmental objects (those not belonging to E) that appear in the environment. These P systems usually work in the non-deterministic maximally parallel manner, but other execution modes have also been considered and studied (for more details, see Ref. [55]). A sequence of transitions starting with the initial configuration is a computation. The result of a computation in Π is the number of objects from T that can be found in region iout when the system halts. If no terminal alphabet is distinguished, then the number of all objects in iout at the end of a halting computation is the result of the computation. Instead of sets of numbers, sets of vectors of multiplicities of elements of T can also be considered as results.

P systems with antiport and/or symport rules can be considered not only as generating, but also as accepting devices. In this case, the input is the initial contents of a distinguished region. The symport/antiport P system accepts the input if, starting with this initial configuration, it enters a halting configuration by a computation.


The following very simple examples demonstrate how symport/antiport P systems work. Let

Π3 = ({S, a, b, c}, {a, b, c}, {a, b, c}, [1 ]1, (Sabc, R1), 1),

where R1 = {(S, out; Sabc, in), (S, out; abc, in)}. The computation starts with objects S, a, b, c in the skin region. As long as rule (S, out; Sabc, in) is applied, the number of objects a, b, c increases, and the multiset in the skin region consists of equal numbers of objects a, b, and c. After applying rule (S, out; abc, in), the computation halts. The result is the set of numbers N(Π3) = {3n | n ≥ 2}.

Let us consider another symport/antiport P system:

Π4 = ({a}, {a}, {a}, [1 [2 [3 ]3 ]2 ]1, (λ, R1), (λ, R2), (λ, R3), 1),

where R1 = {(aaa, out; a, in)}, R2 = {(a, in)} and R3 = {(aa, out), (aa, in)}. Let Π4 work in the accepting mode, and let region 1 contain a^m, m ≥ 1, as input. By applying rule (aaa, out; a, in), every (non-deterministic maximally parallel) computation step divides the number of copies of a in region 1 by 3. If m = 3^n, then in the last computation step one copy of a enters region 2 and the system halts. If rule (a, in) of the second region is applied before the last step, then eventually at least two copies of a will be present in region 2 at the same time; regions 2 and 3 will then be involved in an infinite communication process, and the computation will never halt. Therefore, Π4 accepts the set of numbers {3^n | n ≥ 0}.
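
As an illustrative sketch (not part of the original formalism), the following Python fragment mimics the two systems: the nondeterministic rule choice in Π3 is resolved with a random coin, and for Π4 we test membership in the accepted set directly instead of tracing its nondeterministic dynamics.

```python
import random
from collections import Counter

def generate_pi3(rng=random):
    """One nondeterministic run of Pi_3; returns the number of objects in the
    skin region when the system halts (some 3n with n >= 2)."""
    skin = Counter({"S": 1, "a": 1, "b": 1, "c": 1})
    while skin["S"] > 0:
        skin.update({"a": 1, "b": 1, "c": 1})  # both rules import one a, b, c
        if rng.random() < 0.5:                 # (S, out; abc, in) was chosen:
            skin["S"] -= 1                     # S leaves for good, so nothing
    return sum(skin.values())                  # is applicable any more: halt

def accepts_pi4(m: int) -> bool:
    """a^m is accepted iff repeatedly dividing m by 3 reaches a single a."""
    while m > 1:
        if m % 3:          # stray copies of a eventually reach region 2, and
            return False   # regions 2/3 then loop forever on (aa,out)/(aa,in)
        m //= 3
    return True            # the last a enters region 2 and the system halts

print(generate_pi3())                      # e.g. 6, 9, 12, ...
print(accepts_pi4(27), accepts_pi4(12))    # True False
```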


It was proved that P systems with symport/antiport rules are computationally complete devices.54 Furthermore, for any register machine, a symport/antiport P system with only one region (one membrane) can be constructed such that it simulates the register machine.56–58 Extensive investigations showed that symport/antiport P systems with small size parameters are as powerful as register machines.59–61 These parameters are, for example, the number of regions, the number of rules in the system, and the number of objects in the multisets of the rules. As in the case of Turing machines, universal antiport P systems can be constructed; moreover, their size parameters are bounded by constants.62

Recall that the result of a computation of a symport/antiport P system is considered in the usual manner: the number of (terminal) objects, or the set of vectors of (terminal) objects, in the output region in a halting configuration. However, a computation can also be described by the sequence of multisets which enter the P system during its functioning. This observation led to the concept of a P automaton, introduced in Ref. [63]. In this case, the input sequences of multisets imported by the symport/antiport P system from the environment are mapped onto words over a finite alphabet and thus form a language. The mapping is required to be "easily" computable, usually in linear space. A related notion, called an analyzing P system,64 uses the mapping that assigns to every multiset the set of words which are permutations of all elements of the multiset; this mapping is usually denoted by fperm. The concept of a P automaton, realizing a combination of the concept of a P system with variants of classical automata, generated several interesting results. For example, P automata working in the non-deterministic maximally parallel manner and with a mapping computable in linear space describe the class of context-sensitive languages,65 while with the fperm mapping they define a class of languages of sublogarithmic space complexity.66 For more information the reader is referred to Refs. [55, 67].
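
The mapping fperm is easy to make concrete. The following sketch computes it for a multiset written as a string of symbols (the function name f_perm is ours):

```python
from itertools import permutations

def f_perm(multiset: str):
    """Map a multiset, written as a string, to the set of all words that are
    permutations of its elements -- the f_perm mapping of analyzing P systems."""
    return {"".join(p) for p in permutations(multiset)}

print(sorted(f_perm("aab")))   # ['aab', 'aba', 'baa']
```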


8.2.5. Tissue-like P systems with symport/antiport rules

Definition 4. A tissue P system with symport/antiport rules of degree q ≥ 1 is a construct

Π = (Γ, E, M1, . . . , Mq, R, iout),

where Γ is a finite set of objects; E ⊆ Γ is the set of objects initially placed in the environment of the system, all of them available in an arbitrary number of copies; Mi, 1 ≤ i ≤ q, are finite multisets of objects initially placed in the q cells of the system; R is a finite set of rules of the following forms: (i, u/λ, j), where 0 ≤ i ≠ j ≤ q, u ∈ Γ+ (symport rules), and (i, u/v, j), where 0 ≤ i ≠ j ≤ q, u, v ∈ Γ+ (antiport rules); and iout ∈ {0, 1, 2, . . . , q} is the output region.

A symport rule (i, u/λ, j) is applicable if cell i contains multiset u of objects. When such a rule is applied, multiset u of objects is sent from region i to region j. An antiport rule (i, u/v, j) is applicable if cell i contains multiset u of objects and cell j contains multiset v of objects. When such a rule is applied, multiset u of objects is sent from region i to region j and, simultaneously, multiset v of objects is sent from region j to region i. The rules of a tissue P system with symport/antiport rules are used in the maximally parallel way, as in other P systems. The concepts of configuration, halting configuration, transition and computation of a tissue P system with symport/antiport rules are similar to those defined for a communication P system.

If cell division rules are introduced into tissue P systems with symport/antiport rules, then such P systems are called tissue P systems with symport/antiport rules and cell division. Tissue P systems with symport/antiport rules and cell separation are defined in a similar way, with cell division replaced by cell separation. A division rule [ a ]i → [ b ]i [ c ]i (a, b, c ∈ Γ) is applicable to a configuration at an instant if cell i contains object a and this cell is not the output cell. When such a rule is applied, cell i is divided into two cells with the same label: in the first copy, object a is replaced by object b; in the second one, object a is replaced by object c; and all the objects in the original cell, different from the object a triggering the rule, are replicated in the two new cells. A separation rule [ a ]i → [ Γ0 ]i [ Γ1 ]i (a ∈ Γ; Γ0, Γ1 are non-empty sets such that Γ0 ∪ Γ1 = Γ and Γ0 ∩ Γ1 = ∅) is applicable to a configuration at an instant if there is a cell i which contains object a and i ≠ iout. When such a rule is applied, object a is consumed, two new cells with label i are generated, and all the objects in the original cell, different from the object a triggering the rule, are distributed between the two new cells according to the partition Γ0, Γ1.
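
The following sketch makes the two rule kinds concrete on cells represented as Python Counters; the helper names divide and separate are ours, not part of the formal notation.

```python
from collections import Counter

def divide(cell, a, b, c):
    """[a]_i -> [b]_i [c]_i : a is replaced by b in one copy and by c in the
    other; every other object is replicated in both new cells."""
    assert cell[a] >= 1
    rest = cell.copy()
    rest[a] -= 1
    rest += Counter()                     # drop the now-zero entry for a
    return rest + Counter([b]), rest + Counter([c])

def separate(cell, a, gamma0, gamma1):
    """[a]_i -> [Gamma_0]_i [Gamma_1]_i : a is consumed; the remaining objects
    are distributed according to the partition Gamma_0 / Gamma_1."""
    assert cell[a] >= 1 and not (gamma0 & gamma1)
    rest = cell.copy()
    rest[a] -= 1
    left = Counter({x: n for x, n in rest.items() if x in gamma0 and n > 0})
    right = Counter({x: n for x, n in rest.items() if x in gamma1 and n > 0})
    return left, right

print(divide(Counter("axy"), "a", "b", "c"))
# (Counter({'x': 1, 'y': 1, 'b': 1}), Counter({'x': 1, 'y': 1, 'c': 1}))
print(separate(Counter("axy"), "a", {"x"}, {"y", "a"}))
# (Counter({'x': 1}), Counter({'y': 1}))
```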


Tissue P systems with symport/antiport rules were first proposed in Ref. [68], and the model was then extended in Ref. [69] by adding a notion of state to the communication channels, which can be modified when a communication rule is applied. For rules involving objects from one or two regions, tissue P systems with conditional uniport70 or with evolution-communication71,72 have been proposed. For rules involving objects from four regions (possibly one of them being the environment), two acting as inputs and two as outputs, generalized communicating P systems were proposed.73–76 Such a purely communicating P system is a network of cells, where the nodes are labeled and, at any step of the functioning, contain a finite multiset of objects. The rules of this type of P system simultaneously move symbols from two regions into two other regions. This model was inspired both by the symport/antiport paradigm and by the way transitions of Petri nets fire tokens coming from input places and then send them out to output places.77 Tissue P systems with evolutional symport/antiport rules were proposed in Ref. [78]; there, objects are moved between cells or between a cell and the environment, and may evolve during this process.

A concept similar to tissue P systems with evolutional symport/antiport rules is the P colony, introduced in Ref. [79]. These constructs are variants of very simple tissue-like P systems, where the cells (also called agents) have only one region and interact with their joint environment by using programs (collections of rules of a special form). Inside each cell (agent) there is a finite multiset of objects. These objects are processed by a finite set of programs associated to the agent. The number of objects inside each agent is the same and does not change during the functioning of the P colony. The agents share an environment which is represented by a multiset of objects. One of these objects, called the environmental object, is distinguished, and it is supposed to be present in a countably infinite number of copies in the environment. Using their programs, the agents can change the objects at their disposal and can exchange some of their objects with objects present in the environment. These synchronized actions correspond to a configuration change (a transition) of the P colony; a finite sequence of consecutive configuration changes starting from the initial configuration is a computation.


The result of the computation is the number of copies of a distinguished object, called the final object, present in the environment in a final configuration of the P colony. P colonies with small size parameters are computationally complete computing devices, and they have several applications in different research areas.10

In Refs. [80, 81], the notion of flat maximal parallelism was first proposed: in each membrane, a maximal set of applicable rules is chosen, and each rule in the set is applied exactly once in each step. Flat maximal parallelism was also considered in tissue P systems with promoters.82

Inspired by the living cell, several ways of generating new cells have been considered, namely cell division and cell separation, which make it possible to generate an exponential workspace and can thus help solve computationally hard problems in a feasible time. In Ref. [83], tissue P systems with cell division are introduced, and the SAT problem is solved (see Refs. [84, 85] for more details and for solutions to different NP-complete problems using these systems). Tissue P systems with cell separation were proposed in Ref. [86], and their computational complexity aspects were investigated. In Ref. [87], tissue P systems with evolutional symport/antiport rules and cell separation were introduced, and computational complexity aspects were investigated. There are several other variants of tissue P systems, for instance, introducing promoters,88 energy,89 or proteins.90,91

An example is given to illustrate how a tissue P system with symport/antiport rules works. Let Π5 = ({a, b}, {a, b}, {a}, ∅, R, 1) be a tissue P system with symport/antiport rules (having two cells, with labels 1 and 2, respectively; the environment is labeled by 0), where R consists of r1: (1, a/ab^2, 0) and r2: (1, a/λ, 2). The computation of Π5 proceeds as follows. Initially, cell 1 has one copy of object a, and the environment has an arbitrary number of copies of objects a and b. At step 1, either rule r1 or rule r2 is used, non-deterministically.


If rule r2 is used, object a is sent to cell 2 and the system halts; if rule r1 is used, one copy of object a is sent to the environment while one copy of object a and two copies of object b are brought into cell 1. Hence, the number of objects b in cell 1 increases by two at each step, as long as rule r1 keeps being applied. Only when rule r2 is applied is object a sent to cell 2, after which the system halts. Hence, 2n (n ≥ 0) copies of object b are produced in cell 1.
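
A sketch of Π5 in Python, with the nondeterministic choice between r1 and r2 resolved by a coin flip; each run returns the number of copies of b left in cell 1, that is, some even number 2n:

```python
import random

def run_pi5(rng=random):
    cell1_b = 0                    # cell 1 starts with one a and no b's
    while rng.random() < 0.5:      # r1: (1, a/ab^2, 0) -- a out, a b b in
        cell1_b += 2
    # r2: (1, a/lambda, 2) -- a moves to cell 2, nothing is applicable: halt
    return cell1_b

print(run_pi5())   # 0, 2, 4, ...
```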


8.2.6. Spiking neural P systems

Definition 5. A spiking neural P system (shortly, SN P system) of degree m ≥ 1, as introduced in Ref. [92], is a construct of the form

Π = (O, σ1, . . . , σm, syn, in, out),

where O = {a} is the singleton alphabet (a is called spike); σ1, . . . , σm are neurons; syn ⊆ {1, 2, . . . , m} × {1, 2, . . . , m} is the set of synapses between neurons, with (i, i) ∉ syn for 1 ≤ i ≤ m; and in, out ∈ {1, 2, . . . , m} indicate the input and output neurons, respectively. Each neuron σi, 1 ≤ i ≤ m, has the form (ni, Ri), where (a) ni ≥ 0 is the initial number of spikes contained in the neuron; and (b) Ri is a finite set of rules of the form E/a^c → a^p; d, where E is a regular expression over {a}, c ≥ 1, c ≥ p ≥ 0, and d ≥ 0 is the delay, that is, the interval between applying the rule and releasing the spike(s).

A rule E/a^c → a^p; d is called an extended rule; if p = 1, then the rule E/a^c → a; d is called a standard spiking rule; if p = 0, then the rule E/a^c → λ is called a forgetting rule. Extended rules are thus a generalization of both standard spiking rules and forgetting rules. A rule E/a^c → a^p; d in neuron σi is enabled if the following two conditions are satisfied: (1) the content of neuron σi is described by the regular expression E associated with the rule, that is, if neuron σi contains k spikes, then a^k ∈ L(E); and (2) the number of spikes in neuron σi is not less than the number of spikes consumed by the rule, that is, k ≥ c. The application of this rule means that neuron σi consumes c spikes and produces p spikes after a delay of d steps. If d = 0, then the p spikes emitted by neuron σi are replicated and sent immediately to all neurons σj such that (i, j) ∈ syn. If d ≥ 1 and the rule is applied at step t, then at steps t, t + 1, . . . , t + d − 1 the neuron is closed, so that it cannot receive new spikes from neurons which have synapses to it (if any of these neurons tries to send a spike to the closed neuron, then that spike is lost). Because two enabled rules E1/a^{c1} → a^{p1}; d1 and E2/a^{c2} → a^{p2}; d2 may have L(E1) ∩ L(E2) ≠ ∅, it is possible that more than one rule can be applied in a neuron at some moment. In this case, the common strategy adopted is that one of the enabled rules is non-deterministically chosen to be executed. If the rule is a forgetting one, of the form E/a^c → λ, then when it is applied, c spikes are removed from the neuron.

The configuration of an SN P system at a given time is described by the number of spikes present in each neuron and by the number of steps to count down until it becomes open again (this number is zero if the neuron is already open). That is, the configuration of system Π is of the form ⟨r1/t1, . . . , rm/tm⟩ with ri ≥ 0 and ti ≥ 0, where ri indicates that neuron σi contains ri spikes and will become open after ti steps, i = 1, 2, . . . , m. With this notation, the initial configuration of system Π is ⟨n1/0, . . . , nm/0⟩. By using the rules as described above, one can define a sequence of consecutive configurations. Each passage from a configuration C1 to a successor configuration C2 is called a transition and is denoted by C1 ⟹ C2. Any sequence of transitions starting from the initial configuration constitutes a computation. A computation halts if it reaches a configuration where all neurons are open and no rule is enabled.
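
Conditions (1) and (2) are easy to make operational. The sketch below checks rule applicability and firing for a single neuron, writing the regular expression over {a} as a Python regex matched against the string a^k; it only illustrates the definition and is not a full SN P simulator.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    E: str    # regular expression over {a}, e.g. 'a(aa)*' for odd numbers
    c: int    # spikes consumed
    p: int    # spikes produced (p = 0 means a forgetting rule)
    d: int    # delay in steps

def enabled(rule, k):
    # condition (1): a^k must belong to L(E); condition (2): k >= c
    return re.fullmatch(rule.E, "a" * k) is not None and k >= rule.c

def fire(rule, k):
    """Return (spikes remaining in the neuron, spikes to emit after d steps)."""
    assert enabled(rule, k)
    return k - rule.c, rule.p

r = Rule(E="a(aa)*", c=1, p=1, d=0)   # enabled on an odd number of spikes
print(enabled(r, 3), fire(r, 3))      # True (2, 1)
print(enabled(r, 4))                  # False: a^4 is not in L(a(aa)*)
```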


With any computation, halting or not, one associates a spike train, that is, a binary sequence in which an occurrence of digit 1 (respectively, 0) indicates that the output neuron sends one spike (respectively, no spike) out of the system. The result of a computation can be defined in several ways. In the field of SN P systems, the result of a computation is typically defined as the time interval between the first two spikes emitted, for any spike train containing at least two spikes.93,94 Besides, one can also consider the result of a computation to be the total number of spikes emitted to the environment by the output neuron when the computation halts.95 Moreover, the result of a computation can also be defined as the spike train itself; in this way, an SN P system is used as a generator of binary string languages over the alphabet {0, 1}.96

Taking inspiration from different biological phenomena, various new types of SN P systems have been proposed: SN P systems with astrocytes, inspired by the excitatory and inhibitory action of astrocytes on synapses97; SN P systems with weights, inspired by the fact that a biological neuron can fire only when its membrane potential reaches or exceeds its threshold potential98; SN P systems with polarizations, taking the polarized cell membrane of a neuron as inspiration99; axon P systems, motivated by the information processing function of an axon in a nervous system100; and coupled neural P systems, inspired by Eckhorn's neuron model.101 Moreover, with mathematical and computer science motivations, a number of new types of SN P systems were also put forward, for example, SN P systems with structural plasticity, drawing on the self-organizing and self-adaptive features of artificial neural networks102,103; cell-like SN P systems, incorporating the hierarchical arrangement of neurons from cell-like P systems104; and SN P systems with communication on request, inspired by the request-response pattern in parallel communicating grammar systems.105

Figure 8.2. An SN P system Π6 generating all natural numbers greater than 1.

An example is given to illustrate the previous definition and the functioning of SN P systems. The SN P system Π6 shown in Figure 8.2 consists of three neurons σ1, σ2, and σ3, where neuron σ3 is the output one. The system functions as follows. All neurons fire at the first step, with neuron σ2 choosing non-deterministically between its two rules. Neuron σ1 can fire only if it contains four spikes; two spikes are consumed, and the other two remain available for the next step. Both neurons σ1 and σ2 send two spikes to the output neuron σ3; these four spikes are removed at the next step. Neurons σ1 and σ2 also exchange their two spikes; therefore, as long as neuron σ2 applies the rule a^2 → a^2; 0, neuron σ1 receives two spikes, thus completing the four spikes needed for firing again.

If at some step (possibly the first one), neuron σ2 chooses to apply the rule a^2 → a^2; 1, then the two spikes of neuron σ1 cannot enter neuron σ2; they are only sent to neuron σ3. In this way, neuron σ2 will never work again, because it remains empty. At the next step, neuron σ1 has to apply its forgetting rule a^2 → λ, while neuron σ3 simultaneously fires by using the rule a^2 → a; 1. Meanwhile, neuron σ2 emits its two spikes, but they cannot enter neuron σ3, because it is closed at this step; the two spikes enter neuron σ1, but they are removed at the next step. Ultimately, the computation halts with the spike emitted from neuron σ3, and no spike remains in the system. Because of the waiting moment imposed by the rule a^2 → a; 1 in neuron σ3, the two spikes of this neuron cannot be consecutive: at least two steps must pass in between. Consequently, it can be concluded that the SN P system Π6 generates all natural numbers greater than 1.

8.2.7. Enzymatic numerical P systems

Definition 6. An enzymatic numerical P system (shortly, ENP system) of degree m ≥ 1, as introduced in Ref. [106], is a construct of the form

Π = (m, H, μ, (Var1, E1, Pr1, Var1(0)), . . . , (Varm, Em, Prm, Varm(0))),


where m ≥ 1 is the number of membranes; H is an alphabet that contains m symbols; μ is a membrane structure with m membranes labeled injectively by 1, 2, . . . , m; and each membrane i is characterized by a four-tuple (Vari, Ei, Pri, Vari(0)), 1 ≤ i ≤ m, where (a) Vari is a finite set of variables from compartment i; (b) Ei ⊆ Vari is a finite set of enzyme variables from compartment i; (c) Pri is a set of programs from compartment i; and (d) Vari(0) is the set of initial values of the variables from compartment i. Each program has one of the following forms:

(i) non-enzymatic form: Fl,i(x1,i, . . . , xki,i) → cl,1 | v1 + · · · + cl,ni | vni, where Fl,i(x1,i, . . . , xki,i) is the production function and cl,1 | v1 + · · · + cl,ni | vni is the repartition protocol;

(ii) enzymatic form: Fl,i(x1,i, . . . , xki,i) |ej,i → cl,1 | v1 + · · · + cl,ni | vni, where ej,i is an enzyme variable from Vari different from x1,i, . . . , xki,i and from v1, . . . , vni.

The non-enzymatic programs of type (i) are exactly like the ones from standard numerical P systems (NP systems, for short).107 They are executed as follows. At time t ≥ 0, the system computes a production value Fl,i(x1,i(t), . . . , xki,i(t)) by taking the current values of the variables x1,i, . . . , xki,i belonging to compartment i. Afterwards, according to the distribution coefficients cl,1, . . . , cl,ni, the production value is distributed to the variables v1, . . . , vni from compartment i, from the upper compartment, or from the immediately inner compartments. Specifically, each variable vs is assigned the value ql,i(t) · cl,s, 1 ≤ s ≤ ni, where ql,i(t) is computed as

ql,i(t) = Fl,i(x1,i(t), . . . , xki,i(t)) / (cl,1 + · · · + cl,ni).

Notably, the variables involved in the production function are reset to zero after the production value is computed; all other variables retain their current values. After repartition, the quantities assigned to each variable by the various repartition protocols are added to the current value of that variable. If there are multiple programs that can be executed in a compartment at some moment, then the common strategy adopted is that all applicable programs are executed simultaneously; this is called the all-parallel mode.


The programs can also be applied in the one-parallel mode or in the sequential mode; for more details, see Refs. [107, 108]. An enzymatic program Fl,i(x1,i, . . . , xki,i) |ej,i → cl,1 | v1 + · · · + cl,ni | vni can be executed at a time t only if ej,i(t) > min{x1,i(t), . . . , xki,i(t)}. In this sense, enzyme variables are used to control the application of programs. Apart from this applicability condition, the execution of enzymatic programs is identical to that of non-enzymatic programs.

The configuration of the ENP system Π at each time t ≥ 0 is described by the values of all the variables of the system, represented by Ct = ⟨x1,1(t), . . . , xk1,1(t), . . . , x1,m(t), . . . , xkm,m(t)⟩, where the coordinate xj,i(t) ∈ R indicates the value of variable xj,i from compartment i at time t, for 1 ≤ i ≤ m and 1 ≤ j ≤ ki. Consequently, the initial configuration of the system is written as C0 = ⟨x1,1(0), . . . , xk1,1(0), . . . , x1,m(0), . . . , xkm,m(0)⟩. A transition of system Π leads from one configuration to the next: given the values of the variables xj,i(t) at time t, the values xj,i(t + 1) at time t + 1 are computed by applying the corresponding programs. A sequence of such transitions forms a computation. If no applicable set of programs produces a change in the current configuration, then the system is said to have reached a halting configuration. The result of a computation is defined as the values of the designated variables during the computation.

An example is given to illustrate the previous definition and the way in which ENP systems work. The ENP system Π7 shown in Figure 8.3 has one membrane, holding the programs and, in square brackets, the initial value of each variable; the variable x1 is designated as the output one.

Figure 8.3. An ENP system Π7 generating the Fibonacci sequence.


The ENP system Π7 works as follows. At each step t ≥ 0, the value of the enzyme variable e1 is increased by the value x1(t); thus e1(t) > min{x1(t), x2(t)} always holds, and the programs (x1 + x2) |e1 → 1 | x1 and 2x1 → 1 | x2 + 1 | e1 are applied at every step. Therefore, at each step t, the value of variable x2 becomes x1(t), while the value of variable x1 becomes x1(t) + x2(t). As x1 is the output variable, the sequence of its values during the computation is the result of the computation, and it is easily checked that the ENP system Π7 generates the Fibonacci sequence.
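
A sketch of Π7 in the all-parallel mode: the two programs are the ones quoted above, while the initial values x1 = 1, x2 = 0, e1 = 1 are our assumption, since the actual values are specified only in Figure 8.3.

```python
def run_pi7(steps):
    # assumed initial values (the actual ones are in Figure 8.3):
    x1, x2, e1 = 1, 0, 1
    trace = [x1]
    for _ in range(steps):
        # e1 > min(x1, x2) holds, so both programs fire (all-parallel mode)
        prod1 = x1 + x2    # program (x1 + x2)|e1 -> 1|x1
        prod2 = 2 * x1     # program 2x1 -> 1|x2 + 1|e1: two equal unit shares
        # x1 and x2 occur in production functions, so they are reset and then
        # receive their shares; e1 keeps its value and gains half of prod2
        x1, x2, e1 = prod1, prod2 // 2, e1 + prod2 // 2
        trace.append(x1)
    return trace

print(run_pi7(8))   # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```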


By extending some "general" notions of membrane computing to NP systems, various types of NP systems have been proposed. For example, drawing on the threshold control adopted in SN P systems, several other strategies for controlling the execution of programs have been introduced in NP systems, such as NP systems with variable threshold control,109 NP systems with production threshold control,110 and NP systems with Boolean condition control.111 Inspired by the communication of objects through membranes in the initial membrane computing models, NP systems with migrating variables were put forward.112

8.3. Computing Power and Computational Efficiency

8.3.1. Introduction

When studying computing models within the framework of a given computing paradigm, there are two important topics to be analyzed: computing power and computational efficiency. The first topic, also called computational completeness or universality, consists in determining which syntactic and/or semantic features are necessary for the models to be equivalent in power to Turing machines. The computational efficiency of a computing model refers to its ability to provide polynomial time solutions to computationally hard problems, usually by making use of an exponential workspace constructed/created in linear time.

At the end of 1998, the first foundations of a new computing paradigm, called Membrane Computing, inspired by some basic biological features of living cells, as well as by the cooperation of cells in tissues, organs and organisms, were introduced by Gh. Păun.1 The seminal paper focused on the computational completeness of the models considered, generically called membrane systems: transition P systems, P systems based on rewriting, and splicing P systems. In particular, a transition P system consists of a collection of unit processors, called membranes, hierarchically structured by means of a rooted tree. The membranes delimit regions containing multisets of objects which can evolve according to some prefixed rewriting rules, applied in a non-deterministic and maximally parallel way. The precise definition of a transition P system, together with an example, can be found in Section 8.2.2 (Definition 1).

In membrane computing there are, basically, two ways to consider computational devices: cell-like membrane systems (P systems) and tissue-like membrane systems (tissue P systems). The first uses biological membranes arranged hierarchically, inspired by the structure of the cell; the second uses biological membranes (called cells in this approach) placed in the nodes of a graph, inspired by cell inter-communication in tissues. Throughout this section, the term membrane system will be used to refer indistinctly to a cell-like or a tissue-like P system.

Aspects related to the computational efficiency of membrane systems were first analyzed in 1999 with the introduction of a new computing model, called P systems with active membranes41 — see Definition 2 in Section 8.2.3. These systems are non-cooperative (the left-hand side of any rule consists of only one object) and their membranes play a relevant role in computations, to the extent that new membranes can be created by division rules. The membranes of these systems are supposed to carry an electrical polarization: positive, negative or neutral. In this context, an ad-hoc solution to the satisfiability problem (SAT) by means of this kind of P system was given. Specifically, a P system with active membranes which makes use of simple object evolution rules (only one object is produced by rules of this kind), dissolution rules, and division rules for elementary and non-elementary membranes is associated with every instance ϕ of SAT. Thus, the syntactic structure of the formula is "captured" by the description of the system and, furthermore, in this context a P system can only process one instance of the problem.


The provided solution runs in linear time with respect to the input length of ϕ, that is, the maximum of the number of variables and the number of clauses of the formula ϕ. In Ref. [113], a similar ad-hoc solution to the SAT problem was provided by means of P systems with active membranes without dissolution rules, making use of division rules only for elementary membranes. In this setting, different instances of the SAT problem having the same number of variables and the same number of clauses are processed by different membrane systems.

In order to define in a formal way what solving a decision problem means, basic recognizer transition P systems (initially called decision P systems) were defined,114 and the computational efficiency of this kind of membrane system was studied in that context. Let us recall that an abstract problem can be solved by a single Turing machine, that is, for every instance of the problem, the Turing machine with that instance as input returns the correct answer. This is due to the fact that these machines have an unlimited and unrestricted memory, since the tape consists of an infinite number of cells. Bearing in mind that the ingredients necessary to define a membrane system are finite, an abstract problem must be solved, in general, by a countable family of membrane systems, in such a manner that each of them is in charge of processing all the instances of the same size. However, some decision problems can be solved by means of a single membrane system.

8.3.2. Computing power

A first natural research direction related to membrane systems is the investigation of their computing power, and of how various features (associated with the rules, the membrane structure or the objects) contribute to it. Let us denote by MPm(α) the family of sets of natural numbers generated by multiset P systems of degree at most m ≥ 1 (when the degree of the system is not bounded, m is replaced by ∗), where α = Coop means that cooperative (i.e., context-sensitive) rules can be used, while α = nCoop means that only non-cooperative (i.e., context-free) rules can be used.


Directly from the definitions, we have the following relations:

Lemma 1. MPm(α) ⊆ MPm+1(α), m ≥ 1; MPm(nCoop) ⊆ MPm(Coop), m ≥ 1.

By recalling the Turing–Church thesis, or by an explicit construction by means of, for example, a Turing machine, we can prove

Lemma 2. MP∗(α) ⊆ NRE.

It is known115 that when only non-cooperative rules are used, the hierarchy collapses at level one:

Lemma 3. MP∗(nCoop) ⊆ MP1(nCoop) ⊆ NCF.

On the contrary, the use of cooperative rules in the framework of membrane systems turns out to be as powerful as Turing machines, as proved already in the seminal paper by Păun.1 A characterization of NRE has also been obtained by simulating another class of simple computational devices, namely register machines.116 Such a simulation is useful for various reasons, for instance because it can be adapted to give various minimization results (e.g., on the number of symbols or rules used) or to characterize subclasses of languages in NRE by considering suitably restricted variants of register machines.

Register machines are a model of computation consisting of a constant number m of registers, which contain arbitrarily large natural numbers. A register machine operates according to a fixed number n of instructions (labeled with integer numbers) of three different types: increment, decrement (with zero test), and halt. A counter is used to keep track of the label of the instruction to execute. Some registers of the machine are designated to contain the input, while a specified register (we can always choose the first one) contains the output value. The register machine starts computing from its first instruction until a halting instruction is executed. The result is the value of the output register.


Consider a register machine R having m registers r1, r2, . . . , rm and n instructions p1, p2, . . . , pn, and an arbitrary configuration (pi, v1, v2, . . . , vm), where pi denotes the instruction to be executed and v1, v2, . . . , vm are the values stored in the registers. A configuration can be encoded in a P system by using an alphabet of size m + n, through a multiset (Pi, R1^{v1}, R2^{v2}, . . . , Rm^{vm}), where Pi is used to store the instruction to execute, and each object Rh is present in a number of copies corresponding to the value of the corresponding register rh.

An increment instruction pi : inc(rh), pj is implemented by the rewriting rule Pi → Rh Pj: the object Pi is replaced by two objects, Rh (thus simulating the increment of register rh) and Pj (preparing the object necessary to simulate the next instruction pj).

Since a decrement instruction pi : dec(rh), pj, pk needs to execute the zero test (checking whether or not register rh contains the value zero), the simulation requires, in this case, more steps. First, the symbol Pi is replaced through the rule Pi → Pi′ Dh. If the value of rh is greater than zero, the symbol Dh can now be used to delete an occurrence of Rh, thus correctly simulating the decrement. At the same time, the object Pi′ must wait one computation step. These effects are obtained by the rules Dh Rh → Dh′ and Pi′ → Pi″. While the object Pi″ is now surely present, there are two possibilities for Dh: if the value of register rh was greater than zero, then a corresponding object Rh has been deleted and Dh has been transformed into Dh′, by means of the rule Dh Rh → Dh′; on the contrary, if the value of the register was zero, then this rule could not be applied, and the object Dh is still present as a non-primed symbol. We can thus produce the correct symbol corresponding to the instruction to be simulated next, using the rules Pi″ Dh′ → Pj and Pi″ Dh → Pk. When Pi corresponds to a halting instruction, we can simply omit it from the right-hand side of the rule, so the computation stops. The output is the number of symbols corresponding to the output register.
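
The construction can be exercised directly. The sketch below hard-codes the rules above for a two-register machine computing r1 := r1 + r2 (repeatedly decrement r2 and increment r1); the maximally parallel step is simplified to "apply every applicable rule once", which is safe here because the applicable rules never compete for the same objects.

```python
from collections import Counter

def mset(*symbols):
    return Counter(symbols)

def step(config, rules):
    """One (simplified) maximally parallel step: every rule whose left-hand
    side is present at the start of the step is applied exactly once."""
    out = config.copy()
    for left, right in rules:
        if all(config[s] >= n for s, n in left.items()):
            out.subtract(left)
            out.update(right)
    return +out   # drop zero counts

# p1: dec(r2), p2, p3 ; p2: inc(r1), p1 ; p3: halt
rules = [
    (mset("P1"), mset("P1'", "D2")),       # start the zero test on r2
    (mset("D2", "R2"), mset("D2'")),       # register 2 was non-empty
    (mset("P1'"), mset("P1''")),           # the program symbol waits one step
    (mset("P1''", "D2'"), mset("P2")),     # decrement succeeded -> go to p2
    (mset("P1''", "D2"), mset()),          # r2 was zero -> halt (p3)
    (mset("P2"), mset("R1", "P1")),        # inc(r1), back to p1
]

config = mset("P1", "R1", "R1", "R2", "R2", "R2")   # r1 = 2, r2 = 3
while any(s.startswith(("P", "D")) for s in config):
    config = step(config, rules)
print(config["R1"])   # 5 = 2 + 3
```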


From a formalization of this construction, we obtain

Theorem 1. MP∗(Coop) = MPm(Coop) = NRE, m ≥ 1.

By modifying the construction in an appropriate way, the number of symbols used can be reduced by separating the registers into different regions, using more membranes. The value of each register ri is then stored as the multiplicity of a single object a inside the corresponding membrane ri, and the object Pi is sent to the corresponding membrane in order to increment or decrement the register. If, moreover, the use of polarizations is allowed to control the movement of Pi through membranes, then cooperative rules can also be replaced by non-cooperative ones (by exploiting polarizations to implement the zero test). We refer the reader to Refs. [38, 117, 118] for further details.

8.3.2.1. Rewriting membrane systems

The basic variant of P systems uses objects of an atomic type to store information. Each object is a unique entity; the application of an evolution rule determines the substitution of the object by a different one. Nonetheless, one can also consider the use of complex molecules, such as DNA or proteins, to store information. In this case, the action of a chemical compound on such a molecule leads to a molecule which differs only in some parts from the starting one. Objects are still treated as a whole, but their complete structure can be taken into account.

Such a variant was introduced already in Ref. [1] as Rewriting P systems (or RP systems): objects are described by finite strings over a finite alphabet, and they evolve by means of context-free rewriting rules. Each string in the system is processed by only one rule at a time; the parallelism of the system consists in processing simultaneously all available strings by all applicable rules. If several rules can be applied to a string at the same time, then only one rule is non-deterministically chosen to be applied.

We denote by RPm(α), m ≥ 1, the family of languages generated by rewriting P systems with at most m membranes; if α = Pri, then the system can make use of priorities over the evolution rules (otherwise, α = nPri).


We can also consider systems to which a terminal alphabet (or output alphabet) is added, the output consisting of the strings containing only terminal symbols. In this case we obtain Extended Rewriting P systems, or ERP systems. The family of languages generated by ERP systems with at most m membranes is denoted by ERPm(α), m ≥ 1, where, as before, α ∈ {Pri, nPri}.

The Power of Rewriting Membrane Systems. Some preliminary results concerning rewriting P systems follow directly from the definitions:

Lemma 4. RPm(α) ⊆ ERPm(α), for m ≥ 1 and α ∈ {Pri, nPri}; RPm(α) ⊆ RPm+1(α), for m ≥ 1 and α ∈ {Pri, nPri}; ERPm(α) ⊆ ERPm+1(α), for m ≥ 1 and α ∈ {Pri, nPri}.

The power of rewriting membrane systems has been the subject of various research works. In Ref. [1], it was proved that rewriting P systems with a single membrane and no priorities exactly characterize the family of context-free (CF) languages, whereas with two membranes non-CF languages can be generated:

Theorem 2. CF = RP1(nPri); CF ⊂ RP2(nPri).

While adding a membrane to systems with a single membrane results in increased generative power, an unlimited number of membranes does not lead to universality, even when the use of a terminal alphabet is considered. In fact, in Ref. [119] it was shown that such systems exactly characterize the class of languages MAT, that is, the class of languages generated by λ-free context-free matrix grammars.120

Theorem 3. MAT = ERP4(nPri) = ERP∗(nPri) ⊂ RE.

In order to obtain Turing-powerful rewriting P systems, we need to make use of other features, such as priority relations.


In Ref. [121], the following result was proved, showing that two membranes are enough to obtain universality when priorities are allowed:

Theorem 4. RE = RP2(Pri).

Priorities can be avoided by considering different kinds of restrictions which are well known in formal language theory. Some examples in this direction have been presented, for instance, in Refs. [119, 122], by considering leftmost derivations (a string is always rewritten in the leftmost position which can be rewritten by a rule from its region) or forbidding/permitting conditions (a rule can be applied only to strings that do not contain/do contain some specific symbols). It turned out that while leftmost derivations and forbidding conditions result in an increased computational power, quite surprisingly this is not true for permitting conditions. Denoting by left, forb, and perm the use of the features just described, the following results have been obtained:

Theorem 5. RE = RP4(left); MAT ⊂ RP2(left); RE = RP2(forb) = RP∗(forb) = ERP2(forb) = ERP∗(forb); MAT = ERP∗(perm) = ERP2(perm).

8.3.3. Computational efficiency

The first foundations of a computational complexity theory in membrane computing were given in Refs. [114, 123]. In the seminal paper on membrane computing, the models defined are (cell-like) P systems with an output membrane but without an input membrane; a single initial configuration, given by the initial multisets over the working alphabet, is associated with such P systems. In Ref. [114], P systems with input membrane were introduced. These systems have a distinguished membrane (the input membrane) and an input alphabet Σ (strictly contained in the working alphabet Γ), in such a manner that the initial multisets of the system are multisets over Γ \ Σ. Any P system Π with input membrane has many different initial configurations, one for each multiset m over Σ (the multiset m is added to the initial multiset associated with the input membrane).


In this case, the system Π with input multiset m is denoted by Π + m.

8.3.3.1. Recognizer membrane systems

Bearing in mind that the solvability of decision problems is defined through the recognition of languages, recognizer membrane systems are introduced in the framework of Membrane Computing.

Definition 7. A recognizer membrane system is a membrane system (with or without input membrane) such that: (a) the working alphabet Γ contains two distinguished elements yes and no; (b) in the case of a P system without input membrane, the initial multisets are multisets over Γ, while in the case of a P system with input membrane, the initial multisets are multisets over Γ \ Σ, where Σ is the input alphabet of the system; (c) all computations halt; and (d) if C is a computation of the system, then either object yes or object no (but not both) must have been sent to the output region of the system, and only at the last step of the computation.

For recognizer P systems, a computation C is said to be an accepting computation (respectively, a rejecting computation) if the object yes (respectively, no) appears in the output region associated with the corresponding halting configuration of C. Bearing in mind that every computation in a recognizer membrane system is a halting computation, the left-hand side of any rule of the system must contain at least one object. These concepts extend in a natural way to tissue-like P systems, inspired by cell inter-communication in tissues. Specifically, if Γ, Σ and E are the working alphabet, the input alphabet and the alphabet of the environment, respectively, then E ⊆ Γ \ Σ and the initial multisets of a tissue-like P system are multisets over Γ \ Σ.

8.3.3.2. Polynomial time complexity classes

The first results showing that membrane systems can solve computationally hard problems in polynomial time were obtained using P systems without input membrane.41,113,124


Such solutions can be considered as special purpose solutions: a specific P system is associated with each instance of the problem, in such a manner that the syntax of the instance is part of the description of the P system.

Semi-uniform solutions. In Ref. [114], special purpose solutions were defined in a mathematical way, under the name of semi-uniform solutions.

Definition 8. Let X = (IX, θX) be a decision problem and let R be a class of recognizer membrane systems "without" input membrane. We say that X is solvable in polynomial time and in a "semi-uniform" way by a family {Π(u) | u ∈ IX} of systems from R, denoted by X ∈ PMC∗R, if the following holds:

• The family is polynomially uniform by Turing machines, that is, there exists a deterministic Turing machine working in polynomial time which constructs the system Π(u) from the instance u ∈ IX.
• The family is polynomially bounded, that is, there exists a natural number k ∈ N such that for each instance u ∈ IX, every computation of Π(u) performs at most |u|^k steps.
• The family is sound with respect to X, that is, for each instance u ∈ IX, if there exists an accepting computation of Π(u), then θX(u) = 1.
• The family is complete with respect to X, that is, for each instance u ∈ IX, if θX(u) = 1, then every computation of Π(u) is an accepting computation.

According to the previous definition:

• We say that the family {Π(u) | u ∈ IX} provides a semi-uniform solution to the problem X.
• For each instance u ∈ IX, the system Π(u) processes u. Besides, from the soundness and completeness of the family with respect to the decision problem X, it follows that the system Π(u) is confluent, in the sense that all computations must give the same answer: either all computations are accepting computations or all computations are rejecting computations.


Uniform solutions. Another kind of solution to decision problems by means of families of recognizer membrane systems can also be introduced. In this case, all instances of the problem of the same size, via a given "reasonable encoding scheme", are processed by the same system, to which an appropriate input is supplied.

Definition 9. Let X = (IX, θX) be a decision problem and let R be a class of recognizer membrane systems "with" input membrane. We say that X is solvable in polynomial time and in a "uniform" way by a family {Π(n) | n ∈ N} of systems from R, denoted by X ∈ PMCR, if the following holds:

• The family is polynomially uniform by Turing machines, that is, there exists a deterministic Turing machine working in polynomial time which constructs the system Π(n) from the number n ∈ N, expressed in unary.
• There exists a pair (cod, s) of polynomial-time computable functions over IX such that for each n ∈ N, the set s−1(n) is finite, and for each u ∈ IX, s(u) ∈ N and cod(u) is an input multiset of the system Π(s(u)).
• The family is polynomially bounded with respect to (X, cod, s), that is, there exists k ∈ N such that for each u ∈ IX, every computation of the system Π(s(u)) + cod(u) performs at most |u|^k steps.
• The family is sound with respect to (X, cod, s), that is, for each u ∈ IX, if there exists an accepting computation of Π(s(u)) + cod(u), then θX(u) = 1.
• The family is complete with respect to (X, cod, s), that is, for each u ∈ IX, if θX(u) = 1, then every computation of Π(s(u)) + cod(u) is an accepting computation.

According to the previous definition:

• We say that the family {Π(n) | n ∈ N} provides a uniform solution to the problem X, and that the ordered pair (cod, s) is a polynomial encoding from the problem X to the family {Π(n) | n ∈ N}.
• For each instance u ∈ IX, the system Π(s(u)) processes u when the input multiset cod(u) is supplied to the corresponding input membrane. Besides, the system Π(s(u)) + cod(u) is confluent, in the sense that all computations must give the same answer (either all computations are accepting computations or all computations are rejecting computations).


As a direct consequence of working with recognizer membrane systems, these complexity classes are closed under complement. Moreover, it is easy to prove that they are closed under polynomial-time reductions.125 Obviously, every uniform solution of a decision problem can be considered as a semi-uniform solution using the same amount of computational resources; that is, PMCR ⊆ PMC∗R for any class R of recognizer P systems. It has been proved that, for some membrane systems, the concept of uniform solution is strictly weaker than that of semi-uniform solution.126

8.3.3.3. Limits on efficient computations

In this section, the limitations of polynomial time computations are analyzed for membrane systems whose underlying structure (a rooted tree in the cell-like approach and a directed graph in the tissue-like approach) does not "increase". With respect to such cell-like membrane systems, two interesting results were established127:

• Every deterministic Turing machine working in polynomial time can be simulated in polynomial time by a family of recognizer transition P systems.
• If a decision problem is solvable in polynomial time by a family of recognizer transition P systems, then there exists a deterministic Turing machine solving it in polynomial time.

Consequently, only problems in the class P can be efficiently solved by basic recognizer transition P systems. This result has been extended to recognizer tissue P systems which only use communication rules. In fact, it has been shown that recognizer tissue P systems with symport/antiport rules can be simulated in an efficient manner by basic recognizer transition P systems (see Ref. [128] for details).


From the previous result we deduce that the ability of a membrane system to create an exponential workspace (in terms of the number of objects) in polynomial time is not enough to efficiently solve NP-complete problems (assuming that P ≠ NP).

8.3.3.4. Solving computationally hard problems

According to the previous section, in order to provide polynomial time solutions to computationally hard problems, it is necessary to consider membrane systems able to increase the number of processor units (membranes/cells) during a computation. Specifically, the membrane systems should have the ability to trade space for time by providing an exponential workspace created in linear time. This capability has been implemented by using different mechanisms inspired by cellular mitosis (division rules), autopoiesis (creation rules) or membrane fission (separation rules), among others.

Cell-like membrane systems. P systems with active membranes are a kind of cell-like membrane system having electrical charges associated with membranes; they were first introduced by Gh. Păun.41 These non-cooperative systems incorporate membrane division rules, and the polarization of a membrane, but not its label, can be modified by the application of a rule. Let us denote by AM (respectively, AM(+ne) or AM(−ne)) the class of recognizer P systems with active membranes using division rules (for elementary and non-elementary membranes, or only for elementary membranes, respectively). In the framework of AM(−ne), polynomial time solutions to weakly NP-complete problems (Knapsack,129 Subset Sum,130 Partition131) and to strongly NP-complete problems (SAT,123 Clique,132 Bin Packing,133 Common Algorithmic Problem134) were given.

It has been shown135 that the quantified Boolean formula satisfiability (QBF-SAT) problem can be solved in linear time by a family of recognizer P systems with active membranes (without using dissolution rules) with division rules for elementary and non-elementary membranes. In Ref. [136], a (deterministic and efficient) algorithm simulating a single computation of any confluent recognizer P system with active membranes was described.


Such P systems can be simulated by a deterministic Turing machine working in exponential space and spending time of the order O(2^p(n)) for some polynomial p(n). Thus, PSPACE ⊆ PMC_AM(+ne) ⊆ PMC*_AM(+ne) ⊆ EXP. In Ref. [137], the complexity class PSPACE has been characterized by PMC_AM(+ne).

Previous results show that the usual framework of P systems with active membranes for solving decision problems is too powerful from the computational complexity point of view. For this reason, polarizationless P systems with active membranes are considered, and AM^0(α, β) will denote the class of all recognizer polarizationless P systems with active membranes such that: (a) if α = +d (respectively, α = −d) then dissolution rules are permitted (respectively, forbidden); and (b) if β = +ne (respectively, β = −ne) then division rules for elementary and non-elementary membranes (respectively, only for elementary membranes) are permitted. It is worth pointing out the relevant role played by dissolution rules in the framework of AM^0(α, β) from a complexity point of view. On the one hand, it has been proved138 that P = PMC_AM^0(−d,−ne) = PMC_AM^0(−d,+ne). On the other hand, a polynomial-time solution for the QBF-SAT problem by means of membrane systems from AM^0(+d, +ne) has been given,139 that is, PSPACE ⊆ PMC_AM^0(+d,+ne).

Tissue-like membrane systems. We denote by TDC (respectively, T̂DC) the class of recognizer tissue P systems with cell division and communication rules (respectively, the corresponding class without environment), and TDC(k) (respectively, T̂DC(k)) stands for the class of recognizer tissue P systems with cell division whose communication rules have length at most k (respectively, without environment). It is worth pointing out that, by using the dependency graph technique,138 it has been shown that only problems in the class P can be efficiently solved by means of recognizer tissue P systems with division rules which use communication rules of length exactly one, that is, P = PMC_TDC(1).140 However, a polynomial-time solution for the HAM-CYCLE problem (given a directed graph, to determine whether or not there exists a Hamiltonian cycle in the graph), a well-known NP-complete problem,141 has been given by a family of systems from TDC(2).240


In Ref. [142], it has been shown that any tissue P system with cell division can be simulated efficiently by a tissue P system with cell division and without environment, that is, for each k ≥ 1 we have PMC_TDC(k) = PMC_T̂DC(k). Hence, the role of the environment in the framework of recognizer tissue P systems with cell division and communication rules is irrelevant from the complexity point of view.

8.4. Applications of Membrane Computing

Applications, especially practical applications, act as the engine driving the development of membrane computing. In the past two decades, membrane computing models have been used to solve a wide spectrum of practical application problems.5 This section briefly introduces several representative examples, including modeling ecosystems with population dynamics P systems, path planning and control of autonomous mobile robots, fault diagnosis with spiking neural P systems, and other practical applications such as engineering optimization.

8.4.1. Modeling ecosystems with population dynamics P systems

Membrane computing was not conceived from its origin as a modeling tool. As clearly stated in previous sections, this branch of natural computing emerged to provide computing models inspired by the structure and functioning of living cells. Theoretical studies established solid foundations in terms of computational power and complexity, thus justifying their interest as models of computation. However, as introduced at the beginning of this section, and even to the surprise of the founders of the discipline, relevant practical applications based on P systems started to show up after a few years of existence. And if there is an area where such practical applications have especially stood out, it is the ability of these computational devices to model complex systems, and particularly to capture the population dynamics of ecosystems.


The appearance of membrane systems as a computational modeling tool applied to real ecosystems took place in 2008,145 and this promising research avenue has been growing since then. These approaches have been consolidated as an alternative to classical models based on differential equations (DEs), with certain advantages, in particular regarding flexible scenarios.5, 143 In contrast with DEs, membrane computing models are not only formal/mathematical models but also computational models, thus not requiring approximate numerical methods. This novel modeling framework remarkably meets the most important properties we may expect from a good formal model.144 According to Regev and Shapiro, the key properties that a good model should combine are the following: relevance (capturing the basic properties of the phenomena under study, in terms of structure and dynamics); understandability (being approachable for the people familiar with the phenomena under study, and helping to increase their ability to comprehend the underlying processes); extensibility (providing the modularity to start from a simple model and allow the progressive inclusion of refinements while preserving most of the previous version, being scalable to higher levels of organization); and computational tractability, so that experiments to study the system dynamics in different scenarios can run in reasonable time.

With respect to the application of these membrane computing devices to the modeling of ecosystems, the specific types of P systems used had to meet several conditions to capture the inherent randomness observed in nature, thus leading to the inception of new variants, diverging from those discussed in previous sections and following stochastic or probabilistic approaches. While the former were successfully applied at a micro level, the latter were definitely the choice in the modeling of ecosystems, leading to the definition of multi-environment models known as Population Dynamics P systems143 (PDP systems).


Definition 10. A PDP system of degree (m, q), with m ≥ 1 and q ≥ 1, taking T ≥ 1 time units, is a tuple

Π = (G, Γ, Σ, T, {E_j | 1 ≤ j ≤ m}, R_E, {Π_k | 1 ≤ k ≤ m}),

where:
• G = (V, S) is a directed graph whose nodes V = {e_1, e_2, . . . , e_m} represent environments;
• Γ is the working alphabet and Σ ⊊ Γ is an alphabet describing the objects present in the environments;
• R_E is a finite set of communication rules of the form (x)_{e_j} → (y_1)_{e_{j_1}} · · · (y_h)_{e_{j_h}}, where x, y_1, . . . , y_h ∈ Σ and (e_j, e_{j_l}) ∈ S for 1 ≤ l ≤ h, each applied with a probability given by a computable function p with domain {1, . . . , T};
• E_j, 1 ≤ j ≤ m, are the multisets of objects initially present in the m environments;
• Π_k = (Γ, μ, R, {M_{i,j} | 1 ≤ i ≤ q, 1 ≤ j ≤ m}, {p_{r,j} | r ∈ R, 1 ≤ j ≤ m}), 1 ≤ k ≤ m, represents the single P system included in environment k. All of them share the same skeleton (Γ, μ, R): alphabet, membrane structure and rules. They may differ in the parameter values of each environment, namely in the initial multisets M_{i,j}, 1 ≤ i ≤ q, 1 ≤ j ≤ m, and in the probabilities p_{r,j} attached to the skeleton rules of R, which are of the form u[v]_i^α → u'[v']_i^β.

As this formal definition shows, a number of environments are distinguished, generally used to represent different areas of an ecosystem. The communication rules among environments enable the models to capture possible movements across areas. Inside each of them, the inherent complexities of the different processes and phenomena considered, in relation with the species and their environment, are captured by the structure of the inner P system, with all the rules involved in the dynamics of the species. Thus, rules may cover aspects related to feeding, growth, reproduction, mortality, migration, human factors, climatic effects, etc. The objects of the alphabets typically include the individuals of each species (possibly indexed by gender, age, or other attributes), resources, etc. Besides, certain parameters control many aspects of each run of the model, thus generating different initial populations and different setups of the scenario describing resource availability, the status of environmental factors, etc. These parameters will potentially affect both the initial multisets and the probabilities attached to the rules.
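As a toy illustration of the probabilistic ingredient in this definition, the following Python sketch (the rule encoding is hypothetical, and this is not the DCBA algorithm used by the actual simulators discussed in Section 8.5) distributes the copies of a left-hand-side object among competing rules according to their probabilities:

    import numpy as np

    rng = np.random.default_rng(0)

    # Competing skeleton rules sharing the left-hand-side object "herbivore":
    # (right-hand-side multiset, probability); the probabilities sum to 1.
    rules = [
        ({"herbivore": 2}, 0.6),   # reproduction
        ({},               0.3),   # death
        ({"herbivore": 1}, 0.1),   # survival without reproduction
    ]

    def apply_block(count, rules):
        # Distribute `count` copies of the LHS object among the competing
        # rules with a multinomial draw, then build the resulting multiset.
        applications = rng.multinomial(count, [p for _, p in rules])
        result = {}
        for (rhs, _), n in zip(rules, applications):
            for obj, mult in rhs.items():
                result[obj] = result.get(obj, 0) + mult * n
        return result

    print(apply_block(1000, rules))   # roughly {'herbivore': 1300}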


An initial version of this approach for a single environment was presented in 2008, as a proof of concept concerning an ecosystem centered on the Bearded vulture (Gypaetus barbatus) in the Pyrenees.145, 146 The results were promising and experimentally validated against real data but, more importantly, the modeling framework proved to be modular and flexible: the model was extended in an improved version147 by adding more species and features that increased its accuracy. Then, geographical information was incorporated, and the framework was expanded by another approach, called multi-environment probabilistic functional extended P systems148, 149 (later renamed PDP systems), aiming to capture, for instance, scavengers moving along different areas looking for food, with environmental conditions for each area; the same approach was followed to represent the spread of a disease across areas hosting a certain population.150

Along with the evolution of the syntax and semantics of the type of P systems used, the software to simulate these computational models was also subject to subsequent extensions and enhancements. Thus, several simulation algorithms were introduced, implementing in different ways the hybridization between probabilistic rules and their maximally parallel mode of application. In order to make it easier to compare their performance, a simplified abstract virtual ecosystem with three trophic levels (grass, herbivores and carnivores) was designed.151–153 The software platforms used are described in Section 8.5.

On the other hand, as the research line devoted to modeling complex ecosystems with membrane computing devices, and more specifically with PDP systems, consolidated, a step-by-step protocol for building computational models was developed.143 One of the most complex models based on PDP systems simulates the population dynamics of an invasive species, the zebra mussel, in the Ribarroja reservoir.154 In this model, many areas were considered, involving a number of soil and water parameters, along with fluid dynamics aspects and many possible human actions affecting the dynamics of the species involved, both in larval and adult stages, including vessel movements, water renewal flow depending on enterprise decisions, etc.


In addition, a complex internal structure inside each environment was created for controlling the evolution along the two yearly reproductive cycles, depending on the temperatures of 18 different sections of the reservoir; other processes influenced by the distribution of soil types were also considered, in order to capture the effects on the survival of settled adult individuals. All in all, this model showed the ability of PDP systems to study the population dynamics of ecosystems in which the number of processes, parameters and entities is extremely difficult to control and understand using other approaches.

More recently, new models of real ecosystems based on PDP systems have been designed. In Ref. [155], based on the ecological data of giant pandas in the Chengdu Research Base of Giant Panda Breeding, a probabilistic membrane system with one environment was designed, representing giant panda individuals per age and gender, sources of food and control objects, with evolution rules representing the main biological processes affecting the species. After obtaining a good accuracy in capturing the trends of the population dynamics of giant pandas in captivity, the model was extended to include more natural behaviors of the species in the wild, based on the ecological data of giant pandas in the China Giant Panda Conservation Research Center.156 Thus, a release module was added to the previous model, improving the accuracy of the P system solutions compared with Ref. [155]. A more detailed overview of the history of ecosystem models based on P systems can be found in Ref. [157].

8.4.2. Path planning and control of mobile robots

Robots are physical devices acting in the real world and interacting with human beings and environment elements by means of sensors, motors and computation units. This is one of the main reasons why robot motion planning is an important research area in robotics.


In general terms, the motion planning problem can be defined as follows: given a start state, a goal state, a geometric description of the robot, and the location of the obstacles in the environment, find a sequence of commands that moves the robot gradually from the start state to the goal state. The classical solution divides the problem into three subproblems: global planning, local planning and PID control, where
• The global planning considers the location of static obstacles in the environment (a precomputed map should be given as input), as well as the starting and goal positions. It computes a path avoiding obstacles before the robot starts to move. There are two versions of the global planning problem: (a) the feasibility problem (finding a feasible path, if one exists); (b) the optimality problem (finding a feasible path with minimal cost, where the cost is given by a computable function).
• The local planning takes the given path and tries to move the robot following the route while considering kinematic and physical constraints as well as avoiding both static and dynamic obstacles. The output of the local planner is a sequence of velocity references in an open loop.
• The PID control takes each velocity reference and commands the motors in order to maintain a constant velocity as close as possible to the velocity reference until the next reference is given.
Membrane computing provides a new way of implementing such algorithms on parallel architectures. Nevertheless, one of the main problems encountered was related to the difficulty of working in a continuous state space. That was the reason why the Enzymatic Numerical P system framework (ENPS)106 (see Definition 6 in Section 8.2.7) was initially selected to model PID controllers; remember that ENPS uses variables containing real numbers instead of multisets of objects. Several membrane controllers have therefore been presented: in Ref. [158], the development of membrane controllers for mobile robots is formulated; in Ref. [159], a novel trajectory tracking control approach for nonholonomic wheeled mobile robots using ENPS is proposed.
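For concreteness, the following Python sketch shows the discrete PID step that such membrane controllers ultimately compute; it is the textbook control law, not the ENPS encoding of Refs. [158, 159], and all names are illustrative:

    class PID:
        # Discrete PID: u = Kp*e + Ki*integral(e) + Kd*de/dt.
        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = 0.0

        def step(self, reference, measured):
            error = reference - measured
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    # One control cycle: track a velocity reference issued by the local planner.
    pid = PID(kp=1.2, ki=0.5, kd=0.05, dt=0.01)
    command = pid.step(reference=0.80, measured=0.55)   # motor command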


In Ref. [160], a multi-behavior robot coordination controller is designed using ENPS. In Ref. [21], some current results and challenges concerning membrane controllers are presented. The control of robot swarms has also been studied,3, 161, 162 using P colonies. It is worth mentioning that in Ref. [161] the Robot Operating System (ROS) framework was used for the implementation. ROS is a well-known software framework using C++ and Python for implementing robot systems, including a large variety of drivers for sensors and actuators, as well as facilities to implement custom algorithms.

The global planning problem has also been modeled using membrane computing. In Ref. [163], a modified membrane-inspired algorithm based on particle swarm optimization for mobile robot path planning is presented. In Ref. [164], the Rapidly-exploring Random Tree (RRT) strategy was used for solving the feasibility problem in global planning, together with a new variant of ENPS called Random Enzymatic Numerical P Systems with Shared Memory (RENPSM). RRT is a classical approach in robotics for exploring the obstacle-free space before the robot starts moving. There are two variants of the algorithm: the RRT algorithm and the RRT∗ algorithm (RRT star). The first one solves the feasibility problem; the second one provides good solutions for the optimality problem. The RRT and RRT∗ algorithms were finally modeled and simulated using regular ENPS;165 this work also includes parallel implementations in CUDA and OpenMP.
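The following Python sketch outlines the core RRT loop (uniform sampling, nearest-neighbor search, fixed-step extension); the ENPS and CUDA implementations cited above parallelize exactly these steps, while RRT∗ additionally rewires the tree to reduce path cost. The workspace bounds and the collision check below are placeholders:

    import math, random

    def rrt(start, goal, is_free, step=0.5, iters=2000, goal_tol=0.5):
        # Grow a tree from `start`; stop when a node gets close to `goal`.
        nodes, parent = [start], {0: None}
        for _ in range(iters):
            sample = (random.uniform(0, 10), random.uniform(0, 10))
            # nearest node (linear scan; real implementations use k-d trees)
            i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
            nx, ny = nodes[i]
            d = math.dist((nx, ny), sample)
            if d == 0:
                continue
            new = (nx + step * (sample[0] - nx) / d,
                   ny + step * (sample[1] - ny) / d)
            if is_free(new):                    # placeholder collision check
                nodes.append(new)
                parent[len(nodes) - 1] = i
                if math.dist(new, goal) < goal_tol:
                    break
        return nodes, parent    # recover the path by following `parent` links

    nodes, parent = rrt((0.0, 0.0), (9.0, 9.0), is_free=lambda p: True)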


8.4.3. Fault diagnosis with spiking neural P systems

With the technological revolution of the twenty-first century, the complexity of power systems has been increasing manifold with each passing year. Power systems are composed of generators, transmission lines, busbars and transformers. These components are provided with protection systems containing protective relays (PRs), circuit breakers (CBs) and communication equipment. Power systems are equipped with the supervisory control and data acquisition (SCADA) system and are often threatened by the occurrence of faults during operation. Well-organized protection systems can quickly detect a fault in the power system and activate the corresponding PRs to trip the corresponding CBs whenever a fault occurs, in order to clear it. Moreover, a large amount of alarm messages is received from the SCADA system at the same time, and these messages are analyzed to identify the faults in the system. However, very often the received data is incomplete, and there is also some amount of uncertainty in the tripping of the PRs and CBs, which makes the task of fault diagnosis more complex and difficult.

In recent years, some bio-inspired models have been introduced which can solve the problem of fault diagnosis of power systems efficiently. The fuzzy reasoning spiking neural P system is one such model. Fuzzy reasoning spiking neural P systems (FRSN P systems) were introduced in Ref. [166]; they can represent fuzzy knowledge and perform complete fuzzy reasoning. Moreover, they are an ideal model for fault diagnosis, having properties such as an intuitive description of the problem, a parallel and distributed computing architecture, dynamic structure, synchronization, non-determinism and non-linearity. The structure of FRSN P systems is as follows:

Definition 11. An FRSN P system166 of degree m ≥ 1 is a construct Π = (A, σ_1, . . . , σ_m, syn, I, O), where: A = {a} is the singleton alphabet (the object a is called a spike); σ_i, 1 ≤ i ≤ m, is a neuron, with σ_i = (α_i, τ_i, r_i), where (i) α_i ∈ [0, 1] represents the (potential) value of the spike contained in neuron σ_i (also called its pulse value), (ii) τ_i ∈ [0, 1] represents the truth value associated with neuron σ_i, and (iii) r_i is a firing/spiking rule contained in neuron σ_i, of the form E/a^α → a^β, where α, β ∈ [0, 1] and E is a regular expression; syn ⊆ {1, 2, . . . , m} × {1, 2, . . . , m}, with i ≠ j for all (i, j) ∈ syn, is the set of synapses between neurons; and I and O represent the input and output neuron sets, respectively.

Neurons in an FRSN P system are classified into three classes: proposition neurons, AND-type rule neurons and OR-type rule neurons. The reasoning algorithm in Ref. [166] is based on the firing mechanism of the neurons. Moreover, this model is capable of representing the fuzzy production rules of a fuzzy diagnosis knowledge base visually, and it can effectively model the corresponding dynamic reasoning behavior.
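To fix ideas, the sketch below propagates pulse values through one reasoning step of a toy FRSN-like network. It assumes, as is common in fuzzy logic (the exact operators vary among the variants discussed below), that AND-type rule neurons aggregate their inputs with min and OR-type neurons with max, each scaled by the neuron's truth value:

    # One reasoning step over a toy FRSN-like network (illustrative operators).
    pulse = {"p1": 0.8, "p2": 0.6, "p3": 0.0}     # proposition neurons

    # Rule neurons: (input neurons, output neuron, type, truth value tau).
    rules = [
        (("p1", "p2"), "p3", "AND", 0.9),
    ]

    for inputs, out, kind, tau in rules:
        values = [pulse[i] for i in inputs]
        aggregated = min(values) if kind == "AND" else max(values)
        # the target neuron keeps the strongest pulse it receives
        pulse[out] = max(pulse[out], aggregated * tau)

    print(pulse["p3"])   # approximately 0.54 = min(0.8, 0.6) * 0.9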


An approach for fault diagnosis of power systems based on FRSN P systems was proposed in Ref. [167]. The proposed method can diagnose single and multiple faults, along with failed or malfunctioning protective devices, and the FRSN P system-based diagnostic model167 has good fault tolerance. These models can be constructed in advance and stored in files, and the diagnostic results can be obtained in no more than five reasoning steps. These inherent properties make it an ideal model for complex applications.

In Ref. [168], fuzzy reasoning spiking neural P systems with trapezoidal fuzzy numbers (tFRSN P systems) were used for fault diagnosis of power systems. In this model, in order to extend the inference ability of tFRSN P systems from classical reasoning to fuzzy reasoning, a matrix-based fuzzy reasoning algorithm based on the dynamic firing mechanism was proposed. The neurons of tFRSN P systems are divided into four types: proposition neurons and three kinds of rule neurons (general, AND and OR). Furthermore, the pulse value contained in each neuron is represented by a trapezoidal fuzzy number over [0, 1] instead of a real number.

In Ref. [169], fuzzy reasoning spiking neural P systems with real numbers (rFRSN P systems) were proposed for fault diagnosis of electric locomotive systems. Using fuzzy production rules, the relationships among breakdown signals and faulty sections in the subsystems of electric locomotive systems were investigated, and according to these rules fault diagnosis models were constructed for these subsystems. A fault diagnosis model for Shaoshan4 (SS4) electric locomotive systems based on rFRSN P systems was introduced, using rules of three types: GENERAL, AND and OR.

A fault diagnosis method was introduced based on fuzzy reasoning spiking neural P systems with trapezoidal fuzzy numbers170 (FDSN P systems). This approach was used to model the faulty section along with an algebraic fuzzy reasoning algorithm which further helped to identify the faults of the power system.


Moreover, from the results obtained it was clear that this model can effectively identify the faults in power transmission networks with single and multiple fault sections, even with incomplete or unreliable data from SCADA. Similarly, in Ref. [171], a weighted fuzzy reasoning spiking neural P system (WFSN P system) was proposed for the diagnosis of faults occurring in the traction power supply system of high-speed railways. A modified fuzzy reasoning spiking neural P system, called the MFRSN P system,172 was also introduced to solve fault diagnosis problems in metro traction power systems. This model is also an rFRSN P system with three types of rules.

A new variant of fuzzy reasoning spiking neural P systems was proposed in Ref. [173]. It was used as a diagnostic technique for analysing power transformer faults based on dissolved and free gas analysis (DGA). In this model, the IEC ratio is used as the input signature, and it can accurately and quickly identify the faults in a power transformer. A variant of spiking neural P systems known as fuzzy reasoning spiking neural P systems with interval-valued fuzzy numbers (ivFRSN P systems) was introduced for investigating fault diagnosis of power systems.174 This model flexibly and effectively handles the incomplete and uncertain messages coming from SCADA systems. Furthermore, intuitionistic fuzzy spiking neural P systems175 (IFSNP systems) were introduced for the identification of faults in power systems from incomplete and uncertain messages, and interval-valued fuzzy spiking neural P systems174 (IVFSNP systems) were defined and applied to analysing the alarm messages from SCADA. A membrane computing fault diagnosis (MCFD) method, based on FRSN P systems, was also introduced,176 helping to solve fault diagnosis problems automatically.

Apart from fuzzy reasoning spiking neural P systems and their variants, which have been very efficient tools for the fault diagnosis of power systems, optimization spiking neural P systems177 (OSN P systems) have been considered. OSN P systems have been used to formulate fault estimation problems as optimization problems.178 This approach makes it possible to automatically derive the minimal value of the objective function of the fault section estimation (FSE) problem and gives the fault section as output.


Machine learning has been an important tool for solving many complex problems in real-life applications, and the idea of machine learning has also been incorporated into spiking neural P systems to solve the fault diagnosis problem of power systems. Adaptive fuzzy SN P systems179 (AFSN P systems) are such models. In these models, the rules corresponding to weighted fuzzy production rules are constructed to carry out fuzzy reasoning, and finally the results of fault diagnosis are obtained. The weights between proposition neurons are adjusted by using Widrow-Hoff learning rules. Most real-world problems are dynamic, and the adaptive properties of AFSN P systems, along with their simple reasoning process, provide an efficient framework for fault diagnosis. This model was further improved in Ref. [180], where the particle swarm optimization (PSO) algorithm was integrated to optimize the learning algorithm of AFSN P systems.

8.4.4. Other applications

Membrane computing models are also useful for solving problems related to engineering optimization. It is well known that there exists a huge number of computationally hard engineering problems which are intractable. These kinds of problems can be solved by formulating them as optimization problems for which an approximate acceptable solution can be provided instead of an analytic one. Membrane algorithms represent an effective approach to solving computationally hard problems because of their parallel and distributed architecture, flexible evolution rules and good convergence properties. These approaches have been used to solve engineering problems in areas such as radar emitter signal analysis, digital image processing, and constrained manufacturing parameter optimization.

The optimization of the time-frequency atom decomposition process of radar emitter signals has been carried out by a modified membrane algorithm based on quantum-inspired evolutionary algorithms and P systems181 (MQEPS). A membrane algorithm with quantum-inspired evolutionary systems182 (MAQIS) has been used to solve the image sparse decomposition problem.


A membrane algorithm designed with a tissue P system and differential evolution183 (DETPS) has been used to solve manufacturing parameter optimization problems, and adaptive membrane evolutionary algorithms184 (AMEA), having a dynamic membrane structure and differential evolution with an adaptive mutation factor, have been used for solving constrained engineering optimization problems.

8.4.5. Concluding remarks

A wide range of practical applications of membrane computing have been investigated; however, the exploration of potential killer applications remains a challenging task for the future. The investigation of the minimum number of individuals in a population of an ecosystem, such as that of giant pandas, is a promising topic in modeling ecosystems with population dynamics P systems. Further applications of path planning techniques and membrane controllers to real robots, and the usage of online diagnosis, are some of the most promising developments in the area of fuzzy reasoning spiking neural P systems.

8.5. Implementation of P Systems

The model of P systems is very attractive because it features parallel and synchronous evolution, as well as workspace growth in the case of active membranes. Very quickly, a need for software simulation and hardware implementation of the model arose. There are two main reasons for this. Firstly, having a good simulator makes it possible to test the correctness of the functioning of theoretical constructions and thus to develop more complex algorithms using P systems. Secondly, in many cases P systems started to be used for real-world applications, especially in biological modeling, which required high-speed, robust and scalable simulators.

Since there exist many types of P systems, there are several implementations, mostly targeting different variants or subvariants of them. Besides these software implementations, two types of hardware implementations are considered — using Graphical Processing Units (GPUs) or Field-Programmable Gate Arrays (FPGAs) as hardware support.


In the first case, the GPUs of graphics cards are used for general-purpose computations, making it possible to simulate some variants of P systems; a speedup of order 2–20 is usually obtained with such simulators. In the second case, special digital circuits are designed that are highly specific to the simulated instance of the system. This approach makes it possible to obtain speedups of order 10^5; however, this does not take into account the time needed to transfer the data to/from the FPGA, so in a real situation the computation speed and the speedup depend on the connection between the data source and the FPGA. In what follows we briefly discuss the main implementations of P systems using different technologies.

8.5.1. Software implementations

Previous sections made clear the relevance of the achievements of membrane computing, both from a theoretical perspective and as a practical modeling framework with applications in ecology, robotics, economy and other areas. As computing devices, P systems appear to be efficient in solving NP-hard problems. A first goal was to provide simulators that allow different patterns of behavior to be understood. These tools helped in analyzing theoretical models, and also enabled experimentation with various scenarios and hypotheses. Several overview papers have analyzed the contribution of various general-purpose simulation tools for P systems,185–187 or restricted themselves to GPU-based simulators (covering a broad spectrum of P systems,188 or limited to models such as PDP systems189 and SN P systems190). A very thorough recent survey27 (a preliminary version appeared in Ref. [191]) covers all the developments on P system tools and their usage up to 2019. In the rest of this section, we briefly highlight the main facts related to the evolution of the simulators developed by the membrane computing community.


Initially, the focus was on the development of tools providing a better understanding of the newly created devices, with the aim of supporting research, but also with a pedagogical purpose in mind. These tools are meant to assist in the design and verification of different investigations. Most of them provide sequential solutions, but the first parallel simulator was also presented,192 in the form of a C++ implementation using MPI for communication. In parallel, there were investigations on implementing P systems in silico through hardware components, specifically FPGAs.193

Later on, especially between 2005 and 2010, according to Ref. [186], the diversity of P system implementations grew by including, along with many variants of classical cell-like, tissue-like and SN P systems, new stochastic and probabilistic variants. The range of tools, covering mostly theoretical models and their usage in solving NP-hard problems, was expanded with implementations simulating biological systems, including probabilistic or stochastic approaches. Simultaneously, other lines of investigation focused on producing general-purpose tools, applying general specification and execution strategies that work independently of the type of model and can equally handle cell-based or tissue-based P systems, neural systems, or multiple environments. This approach is illustrated by the P-Lingua framework194, 195 (probably the most widely used simulation tool in membrane computing, still in use today), which aims to facilitate the definition of P systems by means of a specification language covering different types and variants of P systems, together with a number of tools to parse, debug and simulate those variants. Additionally, a first in vitro implementation of a kind of membrane system, using test tubes as membranes and DNA molecules as objects, evolving under the control of enzymes, was reported.196 More recently, another in vitro implementation of P systems, based on multivesicular liposomes, was presented.28

In later stages in the development of the discipline, simulation tools crossed the boundaries of the P system community, providing new membrane computing based assistant tools that can be used as black boxes with adequate interfaces, helping in defining, simulating and analyzing domain-specific decision-making processes.


A set of P system applications, conceived with the participation of experts from various fields, especially biology, led to significant outcomes.27 Some of the most relevant contributions, apart from those in biology mentioned above, are in robot control and swarm robotics.161 The main results obtained were collected and presented in Ref. [187] and later updated in Ref. [27]. Some of the main achievements in this period, used extensively for years, were MetaPlab,197 a computational framework for metabolic P systems; Infobiotics Workbench198 for stochastic systems; kPWorkbench199 for kernel P systems; and new extensions of the P-Lingua framework,200 including MeCoSim,201 a general-purpose configurable environment for experimentation with computational models based on different types of P systems, widely used in real applications in ecology.

The products mentioned above provide mostly sequential simulators, focused on problem solving but not particularly addressing efficiency as their main goal. In the following sections, other approaches using high-performance computing to get closer to more efficient implementations of certain aspects of P systems will be described in detail. Along with this line pursuing efficiency, another research direction is currently devoted to generalizing the available simulation tools to a greater extent, allowing the definition of models covering combinations of features of membrane systems not studied so far. Examples are UPSimulator,202 a brand new tool whose impact on the solution of real problems is still to be determined, and the new version of P-Lingua, called P-Lingua 5,26 which includes a language to define meta-models, opening the possibility of specifying the syntax and semantics of newly considered P system variants declaratively instead of programmatically.

8.5.2. GPU-based hardware implementations

In this section, we discuss the current trend of parallel simulation of P systems using GPU computing. First, we introduce the technology; later, we discuss current developments and results.


8.5.2.1. GPU computing

Nowadays, the component integration scale of some high-end GPUs has outpaced that of CPUs, driven by the booming demand for graphics processing (advanced rendering, real-time 3D graphics in video games, etc.).203 Although a GPU is not a general-purpose processing unit, recent techniques and technologies make it possible to program GPUs for purposes other than graphics. Such techniques are known as general-purpose computing on graphics processing units (GPGPU) or, in short, GPU computing. For example, using CUDA or OpenCL, one can parallelize code over the thousands of lightweight processing cores inside a single GPU. Only NVIDIA GPUs support CUDA, while OpenCL is supported by the majority of GPUs (including AMD and Intel). Most P system simulators on GPUs use CUDA because of its maturity, performance and support in several programming languages (mainly C++ and Python).

The CUDA programming model is based on heterogeneous computing,203 where the CPU (host) is the master node that launches kernels (functions) on the GPU (device). A kernel is executed by a grid of (thousands of) threads. The grid is a two-level hierarchy, where threads are grouped into equally-sized thread blocks within which they can be easily synchronized by using barrier operations. Moreover, the memory hierarchy is explicitly managed. A GPU basically contains global memory, which is the largest but slowest memory, and shared memory, which is the smallest but fastest memory.203 Global memory is accessed by all threads launched in all grids and also by the host, but shared memory is only accessible by the threads of a block.
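As a minimal illustration of this programming model, the sketch below uses Numba's CUDA bindings for Python (the CUDA C++ simulators cited in this section follow the same pattern: a grid of thread blocks, with each thread updating one element of a data structure in global memory). It assumes an NVIDIA GPU and the numba package; the kernel itself is a made-up toy, not part of any of the simulators discussed:

    from numba import cuda
    import numpy as np

    @cuda.jit
    def apply_evolution(multiset, delta):
        # Each thread updates the multiplicity of one object (one array cell).
        i = cuda.grid(1)           # global index of this thread in the 1-D grid
        if i < multiset.size:
            multiset[i] += delta[i]

    n = 1 << 20
    multiset = cuda.to_device(np.zeros(n, dtype=np.int64))   # global memory
    delta = cuda.to_device(np.ones(n, dtype=np.int64))

    threads_per_block = 256
    blocks = (n + threads_per_block - 1) // threads_per_block
    apply_evolution[blocks, threads_per_block](multiset, delta)  # kernel launch
    result = multiset.copy_to_host()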


8.5.2.2. GPU simulators of P systems

Since the introduction of CUDA in 2007, the GPU has been considered as a platform to speed up the simulation of P systems. A survey can be found in Ref. [204], where it is concluded that the GPU is a suitable platform for this purpose because it offers: good performance at low cost (for example, the NVIDIA RTX 2080 costs around $800 and can achieve a peak of 10.07 TFLOPS in single precision and 448.0 GB/s of global memory bandwidth); an efficiently synchronized platform (GPUs implement a shared memory system, avoiding communication overload); a medium scalability degree (for example, an RTX 2080 includes 3072 cores and 8 GBytes of memory, but there are scalable multi-GPU systems requiring communication among nodes); and low-to-medium flexibility (kernels can use common data structures as in host code, but both the algorithm and the data structures should be adapted to the GPU for best performance). In short, the GPU offers a good platform for P system simulation because it is inexpensive parallel hardware with a large memory bandwidth and a high degree of parallelism, offering a relatively flexible way to program. Most importantly, we categorize each simulator as generic or specific: a GPU-based simulator is generic if it aims at simulating P system models from a whole P system variant, and it is specific if it is designed for a specific, restricted family of P system models.205 The latter, of course, can achieve larger accelerations because they can better adapt to the P systems to be simulated. In Refs. [204, 206], it is stated that P system simulators are memory-bandwidth bound; thus, GPUs might help in this regard. The source code of most of the following simulators can be found under the open-source PMCGPU project.207 Furthermore, Table 8.1 summarizes the GPU simulators described next, showing the variant, the type of simulator (generic or specific), the restrictions of the models supported or the family of P systems, the main references, and the peak speedup.

P systems with active membranes were the first variant to be simulated on GPUs.208 The design of this generic simulator maps the double parallelism of P systems (membranes and rules) onto the GPU (thread blocks and threads). With this first simulator, using a Tesla C1060 GPU (240 cores), a speedup of up to 7× was reported for a toy example devoted to benchmarking, and of 1.67× for a family of P systems with active membranes solving SAT in linear time. The natural continuation of this work was to develop a specific simulator for the family solving SAT. This simulator followed a similar design by mapping the double parallelism; in this case, the simulator knows in advance which kinds of rules are going to be executed at every step, so it launches dedicated kernels for each stage.

page 314

August 2, 2021

16:43

Handbook of Unconventional Computing (in 2 Vols.) - 9in x 6in

b4205-v1-ch08

Table 8.1. Summary of main GPU simulators defined to date.

Variant | Type | Supported models | Reference | Peak speedup
P systems with active membranes | Generic | 2-level membrane hierarchy | 2010 [208] | 7× (C1060)
P systems with active membranes | Specific | SAT | 2010–2012 [206, 209] | 13,100× (4 × M2050)
P systems with active membranes | Generic | Any | 2014 [210] | 38× (GTX680)
Tissue P systems | Specific | SAT (cell division) | 2013 [211] | 10× (C1060)
Tissue P systems | Specific | Image processing | 2011–2013 [212, 213] | 8× (GT240)
Spiking Neural P systems | Generic | With delays, parallel non-determinism | 2011–2019 [190, 214–216] | 2× (GTX1070)
Spiking Neural P systems | Generic | Fuzzy Reasoning | 2015 [217] | Unknown info.
Population Dynamics P systems | Generic | Any, defining modules | 2012–2020 [189, 218, 219] | 30× (P100)
Enzymatic Numerical P systems | Generic | Any | 2013 [220] | 11× (GTX460M)
Enzymatic Numerical P systems | Specific | RRT and RRT* | 2020 [221] | 24× (RTX2080)
Kernel P systems | Specific | Subset sum | 2013 [222] | 10× (GT650M)
Kernel P systems | Specific | Multi-objective binary PSO | 2018 [223] | 3× (unknown)
Evolution-Communication P systems | Generic | With energy and without antiport rules | 2012 [224] | Unknown
Membrane Algorithm | Specific | Point Set Matching and TSP | 2014 [225] | 14× (GTX560)

Notes: Variant is the target P system type; Type indicates whether the simulator is generic or specific; Supported models shows the main restrictions if generic, or the problem solved by the family of P systems if specific; Reference gives the main citations and years of publication; Peak speedup reports the largest acceleration reported and the GPU employed.

This led to accelerations of up to 63× on the same GPU in a first version,209 and to an extra improvement of 60–90% when applying specific GPU optimizations.206 The latter work also reports a speedup of up to 13,100× when scaling to multi-GPU systems. Further improvements to the generic simulator followed, reporting a speedup of up to 38×,210 mainly based on using shared memory, minimizing data transfers and grouping rules according to a dependency graph. The latter is constructed by grouping rules having common objects in their left-hand sides and right-hand sides.
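A minimal sketch of this grouping (with a hypothetical rule representation): rules are connected whenever their left- or right-hand sides share an object, and the connected components, computed here with union-find, become the groups that can be processed together:

    # Group rules that share objects on either side, via union-find.
    # Each rule is (lhs multiset, rhs multiset) over object names.
    rules = [
        ({"a": 1}, {"b": 2}),
        ({"b": 1}, {"c": 1}),
        ({"d": 1}, {"e": 1}),
    ]

    parent = list(range(len(rules)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = {}   # object name -> first rule index using it
    for idx, (lhs, rhs) in enumerate(rules):
        for obj in list(lhs) + list(rhs):
            if obj in seen:
                union(idx, seen[obj])
            else:
                seen[obj] = idx

    groups = {}
    for idx in range(len(rules)):
        groups.setdefault(find(idx), []).append(idx)
    print(list(groups.values()))   # [[0, 1], [2]]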


In order to explore which P system ingredients are better suited to be handled by GPUs,211 another solution to SAT, based on a family of tissue P systems with cell division, was simulated. The design of this specific simulator used the same concepts as in Ref. [209]. Experiments on an NVIDIA Tesla C1060 GPU showed an acceleration of 10×, demonstrating that the usage of charges associated with membranes helped to reduce the instantiation of objects while being a lightweight ingredient to be represented and processed by CUDA threads. Furthermore, models of tissue P systems for some specific image processing tasks (smoothing and edge detection) have been simulated with CUDA.212, 213

The efficient simulation of Population Dynamics P (PDP) systems is critical for virtual experimentation and experimental validation of real ecosystem models. Let us recall that PDP systems are multi-environment and probabilistic models. Thus, simulating PDP systems in parallel is as simple as executing the simulations and the selection of rules within environments simultaneously. However, GPUs need fine-grained parallelism, and so we need to extract parallelism from the selection of rules. For this purpose, the simulation algorithm DCBA was implemented in both OpenMP and CUDA.218 The selection stage in DCBA consists of three phases: distributing objects among rules, checking maximality, and computing the probabilistic distribution. Specific kernels were implemented for each phase. The CUDA simulator for PDP systems218 distributes environments and simulations across thread blocks, and rules among threads. It was first benchmarked with a set of randomly generated PDP systems (without biological meaning), achieving speedups of up to 7× for large sizes on an NVIDIA Tesla C1060 GPU over a multi-core version. In Ref. [189], the simulator was validated by using a known ecosystem model of scavenger birds, and shown to achieve speedups of up to 4.9× on a C1060 and 18.1× on a K40 GPU (2880 cores). This simulator is clearly generic, because it needs to simulate any model in the variant for better flexibility in virtual experimentation tools. However, a middle-ground solution can be taken: adaptive simulators.


An adaptive simulator receives high-level information in order to better adapt to the simulated P system.219 This is done for PDP systems by extending P-Lingua 5 to support a directive-like ingredient called a feature. PDP system designers can provide information about the modules of the algorithmic scheme of the model. The simulator can then separate the rules belonging to each module and apply DCBA more effectively, reporting up to 2.5× of extra speedup compared to the baseline GPU simulator, and 30× overall with a P100.219

Spiking Neural P (SNP) systems have also been simulated on GPUs. The core of these generic simulations is a matrix representation that can be employed in order to easily compute SNP system transitions13: a spiking transition matrix encodes the synapses and rules, the configuration vector represents the spikes in each neuron at a given step, and the spiking vector tells which rules are going to be fired. As can be seen in the literature, GPUs are good at parallelizing this kind of algebraic operation, and hence this representation was employed in the first CUDA simulator for SNP systems without delays, achieving 6.8×.214 This simulator, named CuSNP, was improved in Ref. [215] by covering computation paths sequentially, leading to speedups of up to 2.31×.215 A set of extensions to CuSNP was implemented in order to support SNP systems with delays, more types of regular expressions, and P-Lingua input files through binary translation.216 Using a GTX 750 GPU, speedups of up to 50× were achieved for very large instances. More recently, in Ref. [190], further extensions to CuSNP were implemented in order to fully support non-deterministic SNP systems with delays by simulating each computation path in parallel; simulations using a GTX 1070 on uniform and non-uniform solutions to the Subset Sum problem report more than 2× of speedup. Another GPU simulator for SNP systems was introduced for Fuzzy Reasoning Spiking Neural P systems.217 This simulator is implemented within pLinguaCore in Java, uses the JCUDA binding library, and its design is also based on a matrix representation. Likewise, a matrix representation was employed to simulate Evolution-Communication P systems with energy and without antiport rules on GPUs.224
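The matrix representation lends itself to a very short sketch. Following the transition scheme described above (next configuration = current configuration plus spiking vector times transition matrix), the toy system below is made up for illustration and ignores delays and rule selection:

    import numpy as np

    # Toy SNP system with 3 neurons, one rule per neuron (made up for
    # illustration).  M[r][j] is the net change in the spikes of neuron j
    # when rule r fires: negative entries consume spikes, positive entries
    # are spikes delivered along synapses.
    M = np.array([
        [-2,  1,  1],   # neuron 1: consume 2 spikes, send 1 to neurons 2 and 3
        [ 0, -1,  1],   # neuron 2: consume 1 spike, send 1 to neuron 3
        [ 0,  0, -1],   # neuron 3: forget 1 spike
    ])

    C = np.array([2, 1, 0])   # configuration vector: spikes per neuron
    S = np.array([1, 1, 0])   # spiking vector: rules firing at this step

    C_next = C + S @ M        # one transition: C' = C + S * M
    print(C_next)             # [0 1 2]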


Other variants, such as Enzymatic Numerical P systems, have been simulated on GPUs. A first generic simulator was developed using CUDA.220 The implemented simulation algorithm reproduces the stages of an ENPS computation step: (1) selection of applicable programs, (2) calculation of production functions, and (3) distribution of production function results according to repartition protocols. Production functions are computed using a recursive solution. In general, the GPU design parallelizes the execution of programs among threads. Using a GeForce GTX 460M, the achieved speedup was up to 11×. In Ref. [221], the RRT and RRT∗ algorithms for robotic path planning were modeled with ENPS, and a specific CUDA simulator was provided. This simulator harnesses GPU primitives such as reduction in order to gain accelerations of up to 24× on an RTX 2080. Finally, an implementation of multi-objective binary PSO with kernel P systems on GPUs223 reported a speedup of up to 3×, and a specific simulator for a solution to Subset Sum with kernel P systems on CUDA gave up to 10×.222 Membrane algorithms have also been implemented for specific problems, with around 14× of speedup.225
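The three ENPS stages listed above for the generic simulator can be sketched sequentially in a few lines of Python. The program encoding is hypothetical; the applicability condition shown (the enzyme must strictly exceed the minimum of the variables read by the production function) is the usual ENPS convention:

    from dataclasses import dataclass
    from typing import Callable, Dict, List, Optional, Tuple

    @dataclass
    class Program:
        inputs: List[str]                     # variables read by the production function
        enzyme: Optional[str]                 # gating enzyme variable (None = always applicable)
        production: Callable[..., float]      # production function F(x1, ..., xk)
        repartition: List[Tuple[str, float]]  # (target variable, proportion coefficient)

    def enps_step(values: Dict[str, float], programs: List[Program]) -> Dict[str, float]:
        # Stage 1: selection. A program is applicable if its enzyme value
        # strictly exceeds the minimum of its input variables.
        active = [p for p in programs
                  if p.enzyme is None
                  or values[p.enzyme] > min(values[x] for x in p.inputs)]
        # Stage 2: evaluate the production functions on the current values.
        produced = [p.production(*[values[x] for x in p.inputs]) for p in active]
        new_values = dict(values)
        for p in active:                      # variables that were read are consumed
            for x in p.inputs:
                new_values[x] = 0.0
        # Stage 3: distribute each produced value proportionally.
        for p, v in zip(active, produced):
            total = sum(c for _, c in p.repartition)
            for target, c in p.repartition:
                new_values[target] += v * c / total
        return new_values

    prog = Program(["x"], "e", lambda x: 2 * x, [("x", 1.0), ("y", 1.0)])
    print(enps_step({"x": 3.0, "y": 0.0, "e": 5.0}, [prog]))
    # {'x': 3.0, 'y': 3.0, 'e': 5.0}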


8.5.3. FPGA-based hardware implementations

An FPGA is a reconfigurable hardware device for prototyping digital circuits. It is composed of a two-dimensional matrix of configurable logic blocks (CLBs), each of them containing several slices. Every slice contains a look-up table (LUT) and a flip-flop (FF). A LUT can implement any Boolean function of a fixed arity (usually 5–7), and a flip-flop is a single-bit memory driven by a clock. CLBs are interconnected by switch boxes routing signals between blocks, so in principle it is possible to route any output of one block to an input of any other block. An FPGA device also contains a number of input/output pins that connect the device to the external world. A circuit is implemented on an FPGA by specifying the functions of the LUTs that are used, as well as by defining the wiring, which is performed by an appropriate manipulation of the interconnect switches. Clearly, any switching circuit (a Boolean circuit with two values, 0 and 1) can be obtained in this manner.

Several FPGA vendors exist; however, only Xilinx devices have been used in the area of P systems. The circuit description is done using a hardware description language (HDL), the most common ones being VHDL and Verilog. The design can be performed at several levels of abstraction, ranging from the switch/transistor level up to the behavioral level (corresponding to a Mealy machine226). Since circuit design is a relatively tedious task, the choice made for the implementation of P systems is to construct a particular circuit for each instance of the considered P system. This simplifies the overall design, but requires additional software that generates an HDL description based on the concrete P system parameters (rules, membrane structure and initial objects). So, when discussing a particular FPGA implementation of P systems, one has to take into account that it is done in two steps: first, specific software is used to input the required P system parameters and output an HDL description of the circuit; then, vendor-specific tools are used to transform the HDL description into a bitstream used to configure the hardware. Finally, the circuit is executed in hardware, and different techniques are used to extract the output.

Since FPGA technology makes it possible to build customized parallel hardware circuits, it was very naturally considered as a candidate for P system implementations from the very beginning of the field. Reference [227] presents the design of the first FPGA circuit simulating transitional P systems. The authors implemented several interesting features, like communication with inner membranes, priorities between rules, and membrane creation and dissolution. The drawbacks of the proposed design are two strong assumptions about the functioning of the system: sequential evolution in each membrane (only one rule can be applied at each step) and the determinism of the simulation (the rules are executed in a predefined order). Also, the proposed implementation of the creation/dissolution of membranes was based on the pre-creation of membranes and their enabling/disabling when needed. Finally, being a first attempt, the scalability of the proposed method is limited, allowing only toy examples to be constructed.

The next step was performed by Nguyen,228, 229 who attacked the problem of non-deterministic evolution in transitional P systems.


The authors proposed several algorithms230 that make it possible to probabilistically choose a multiset of applicable rules from the set of multisets of applicable rules at a given step. However, the proposed algorithm does not guarantee the uniformity of the choice, so the evolution is not completely non-deterministic. It is worth noting that the algorithms used are conceptually similar to those used in the CUDA implementations of P systems for handling non-determinism. The same authors proposed two different types of designs, using either rules or membranes as the main computational unit of the system, as well as different collision resolution strategies (for when the same multiset should be updated in parallel by several processes). Most of the proposed designs were implemented in hardware, and a graphical generator of HDL code, named Reconfig-P, was developed.231–233 The proposed designs are scalable and can be reused for the construction of real-world simulators.

A different approach was followed in Refs. [234, 235]. Instead of simulating a class of P systems, the authors concentrated on the conditions that need to be fulfilled in order to ensure an efficient implementation. They considered P systems as a flattened set of rules,236, 237 hence disregarding the membrane structure, which allows one to reason in terms of rule dependencies (rules that partially require the same objects in order to be applied). It was shown that for particular cases of rule dependencies it is possible to reduce the problem of the non-deterministic choice of the multiset of applicable rules to the problem of counting the words of a given size in a regular language, which in turn reduces to the construction of numeric recurrences. The rule choice then corresponds to the computation of a recurrence and to a run of a finite-state automaton constructing the corresponding rule set. The experimental results exhibit a faster clock rate and a smaller resource consumption than in previous cases, so the approach can be used for real cases (achieving a speedup of 10^5 with respect to a software simulation). The drawback of the method is that it is limited to classes of P systems that exhibit some specific properties (at the same time, there are universal P systems having these properties).
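The counting idea can be illustrated with a few lines of Python: the number of words of length n accepted by a deterministic automaton follows a linear recurrence obtained by iterating its transfer matrix, which is exactly the kind of computation that fits in a compact hardware circuit. The automaton below is a made-up example:

    import numpy as np

    # DFA over {a, b} accepting words with no two consecutive b's
    # (states: 0 = last symbol not b, 1 = last symbol b; both accepting).
    # T[i][j] = number of symbols leading from state i to state j.
    T = np.array([[1, 1],    # from state 0: 'a' -> 0, 'b' -> 1
                  [1, 0]])   # from state 1: 'a' -> 0 ('b' would reject)

    def count_words(n):
        v = np.array([1, 0])          # start in state 0
        for _ in range(n):            # one matrix-vector step per symbol
            v = v @ T
        return v.sum()                # all states are accepting

    print([count_words(n) for n in range(6)])   # [1, 2, 3, 5, 8, 13]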


In the simulations discussed above, the main targets were variants of transitional P systems, which in rule-centric designs correspond to most cases of P systems with a static structure.237 In Ref. [238], the class of Numerical P systems was targeted. It is supposed that the system works in the all-parallel derivation mode, that is, all rules present in a membrane are executed in parallel, sharing the corresponding values of the variables. The paper also proposed a generalization of (Enzymatic) Numerical P systems, enriching the rules with predicates written in Presburger arithmetic. It was shown that the corresponding model can easily be translated to HDL in a very efficient manner, and a specific compiler was developed for this task. The main advantage of the translation is that, if linear functions are used, the corresponding system can be run in hardware at the speed of the clock (each step of the P system corresponds to one clock tick in hardware), thus reaching a speed of 4 × 10^8 computational steps per second, which is at least five orders of magnitude faster than any existing software simulator of numerical P systems. The conducted experiments have also shown a very low resource consumption. This made it possible to use the obtained methods for real-world applications in robotics, for example, for the hardware implementation239 of an obstacle-avoidance controller for wheeled robots.

It is worth noting that this last work was the first to explicitly point out that, in the case of real applications, high-speed execution is limited by the speed of the input and output. In fact, in all the previous FPGA implementations of P systems, the question of how to input the data and how to output the result was never discussed, and the authors used tools like simulation or the hardware debugger to get the results. These methods are, of course, impractical in real usage. So, for the first time, the work in Ref. [238] performed complete tests that included gathering the input via a specific channel (IO pin or serial port), the computation itself, and outputting the data via another channel (IO pin or serial port). This makes it possible to design circuits that work like transducers or digital signal processors and that are ready to be plugged in as computational units for real applications, as in Ref. [239].

8.5.4. Discussion

There is a clear need for the development of implementations of P systems in order to support the growing demand for applications of the model. In recent years there has been a trend towards the standardization of description formats, allowing the tools in use to be interchanged. At the moment of writing, the main tool in the area of software- and GPU-based implementations of P systems is P-Lingua, which is a very flexible and extensible environment for P system simulation. In the case of FPGA implementations, we can observe that they are not user-friendly: their usage still requires a lot of manual intervention and an understanding of circuit design principles. Future efforts in the area are certainly related to the completion and automation of different toolchains, including automatic model learning, simulation and verification. In the FPGA case, the main goal is to achieve better usability, which includes the automatic synthesis of input/output procedures and of the modules that feed or gather the data to/from the FPGA core. For the GPU research line, the main efforts will be devoted to making both the simulators and the models more lightweight in order to better fit the device architecture. Moreover, exploring new technology in GPUs, like tensor cores, can be very interesting, as well as simulating more P system models in a more generic way. Developments using other GPU environments, such as OpenCL or ROCm, are under consideration.

8.6. Concluding Remarks

This chapter aimed to provide a general picture of the latest developments in Membrane Computing. Some of the key concepts and definitions in Membrane Computing were presented, referring to some of the most significant theory of computation results of this nature-inspired computing paradigm. A set of applications in various areas, together with tools helping the simulation, analysis and verification of membrane systems, were also discussed.

References

1. Gh. Păun, Computing with membranes. J. Comput. Syst. Sci. 61(1), 108–143 (2000). A preliminary version as CS-TUCS Report No. 208, Turku Center, 1998.
2. Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), The Oxford Handbook of Membrane Computing (Oxford University Press, 2009).
3. A. Florea and C. Buiu, Membrane Computing for Distributed Control of Robotic Swarms: Emerging Research and Opportunities (IGI Global, Pennsylvania, 2017).
4. G. Zhang, J. Wang, T. Wang, and J. Zhu, Membrane Computing: Theory and Applications (Science Press, Beijing, 2015).
5. G. Zhang, M. Pérez-Jiménez, and M. Gheorghe, Real-life Applications with Membrane Computing, vol. 25 (Springer, 2017).
6. L. Pan, Gh. Păun, and G. Zhang, Foreword: Starting. J. Membr. Comput. 1, 1–2 (2019).
7. B. Aman and G. Ciobanu, Synchronization of rules in membrane computing. J. Membr. Comput. 1(4), 233–240 (2019).
8. V. Manca, From biopolymer duplication to membrane duplication and beyond. J. Membr. Comput. 1(4), 292–303 (2019).
9. V. Manca, Metabolic computing. J. Membr. Comput. 1(3), 223–232 (2019).
10. L. Ciencialová, E. Csuhaj-Varjú, L. Cienciala, and P. Sosík, P colonies. J. Membr. Comput. 1(3), 178–197 (2019).
11. Y. Jiang, Y. Su, and F. Luo, An improved universal spiking neural P system with generalized use of rules. J. Membr. Comput. 1(4), 270–278 (2019).
12. R. de la Cruz, F. Cabarle, and H. Adorna, Generating context-free languages using spiking neural P systems with structural plasticity. J. Membr. Comput. 1(3), 161–177 (2019).
13. X. Zeng, H. Adorna, M.A. Martínez-del-Amor, L. Pan, and M. Pérez-Jiménez, Matrix representation of spiking neural P systems. In M. Gheorghe, T. Hinze, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2011), pp. 377–391.
14. G. Román, Inference of bounded L systems with polymorphic P systems. J. Membr. Comput. 1(1), 52–57 (2019).
15. Z. Gazdag and G. Kolonits, A new method to simulate restricted variants of polarizationless P systems with active membranes. J. Membr. Comput. 1(4), 251–261 (2019).
16. D. Orellana-Martín, L. Valencia-Cabrera, A. Riscos-Núñez, and M. Pérez-Jiménez, P systems with proteins: A new frontier when membrane division disappears. J. Membr. Comput. 1(1), 29–39 (2019).
17. A. Leporati, L. Manzoni, G. Mauri, A. Porreca, and C. Zandron, Characterizing PSPACE with shallow non-confluent P systems. J. Membr. Comput. 1(2), 75–84 (2019).

18. D. Orellana-Martín, L. Valencia-Cabrera, A. Riscos-Núñez, and M. Pérez-Jiménez, Minimal cooperation as a way to achieve the efficiency in cell-like membrane systems. J. Membr. Comput. 1(2), 85–92 (2019).
19. J. Cooper and R. Nicolescu, Alternative representations of P systems solutions to the graph colouring problem. J. Membr. Comput. 1(2), 112–126 (2019).
20. P. Sosík, P systems attacking hard problems beyond NP: A survey. J. Membr. Comput. 1(3), 198–208 (2019).
21. C. Buiu and A. Florea, Membrane computing models and robot controller design, current results and challenges. J. Membr. Comput. 1(4), 262–269 (2019).
22. E. Sánchez-Karhunen and L. Valencia-Cabrera, Modelling complex market interactions using PDP systems. J. Membr. Comput. 1(1), 40–51 (2019).
23. A. Nash and S. Kalvala, A P system model of swarming and aggregation in a Myxobacterial colony. J. Membr. Comput. 1(2), 103–111 (2019).
24. D. Díaz-Pernil, M. Gutiérrez-Naranjo, and H. Peng, Membrane computing and image processing: A short survey. J. Membr. Comput. 1(1), 58–73 (2019).
25. A. Țurlea, M. Gheorghe, F. Ipate, and S. Konur, Search-based testing in membrane computing. J. Membr. Comput. 1(4), 241–250 (2019).
26. I. Pérez-Hurtado, D. Orellana-Martín, G. Zhang, and M. Pérez-Jiménez, P-Lingua in two steps: Flexibility and efficiency. J. Membr. Comput. 1(2), 93–102 (2019).
27. L. Valencia-Cabrera, D. Orellana-Martín, M.A. Martínez-del-Amor, and M. Pérez-Jiménez, An interactive timeline of simulators in membrane computing. J. Membr. Comput. 1(3), 209–222 (2019).
28. R. Mayne, N. Phillips, and A. Adamatzky, Towards experimental P-systems using multivesicular liposomes. J. Membr. Comput. 1(1), 20–28 (2019).
29. P. Bottoni, A. Labella, and G. Rozenberg, Reaction systems with influence on environment. J. Membr. Comput. 1(1), 3–19 (2019).
30. V. Mitrana, Polarization: A new communication protocol in networks of bio-inspired processors. J. Membr. Comput. 1(2), 127–143 (2019).
31. R. Andonie, Hyperparameter optimization in learning systems. J. Membr. Comput. 1(4), 279–291 (2019).
32. G. Ciobanu, Gh. Păun, and M. Pérez-Jiménez, Applications of Membrane Computing (Springer, 2006).
33. P. Frisco, M. Gheorghe, and M. Pérez-Jiménez, Applications of Membrane Computing in Systems and Synthetic Biology, vol. 7 (Springer, 2014).
34. E. Bartocci and P. Lió, Computational modeling, formal analysis, and tools for systems biology. PLoS Comput. Biol. 12(1), e1004591 (2016).
35. P. Sosík, The power of catalysts and priorities in membrane systems. Grammars 6(1), 13–24 (2003).
36. S. Krishna and A. Păun, Results on catalytic and evolution-communication P systems. New Gen. Comput. 22(4), 377–394 (2004).

37. O. Ibarra and H.-C. Yen, Deterministic catalytic systems are not universal. Theoret. Comput. Sci. 363(2), 149–161 (2006).
38. R. Freund, L. Kari, M. Oswald, and P. Sosík, Computationally universal P systems without priorities: Two catalysts are sufficient. Theoret. Comput. Sci. 330(2), 251–266 (2005).
39. R. Juayong and H. Adorna, Relating computations in non-cooperative transition P systems and evolution-communication P systems with energy. Fundamenta Informaticae 136(3), 209–217 (2015).
40. R. Juayong and H. Adorna, On simulating cooperative transition P systems in evolution–communication P systems with energy. Nat. Comput. 17(2), 333–343 (2018).
41. Gh. Păun, P systems with active membranes: Attacking NP-complete problems. J. Automata, Lang. Combinat. 6(1), 75–90 (2001). A preliminary version in Centre for Discrete Mathematics and Theoretical Computer Science, CDMTCS Research Report Series-102, May 1999, 16 pages.
42. A. Alhazov, L. Pan, and Gh. Păun, Trading polarizations for labels in P systems with active membranes. Acta Informatica 41(2–3), 111–144 (2004).
43. A. Alhazov and T.-O. Ishdorj, Membrane operations in P systems with active membranes. In Proceedings of the Second Brainstorming Week on Membrane Computing, Sevilla, ETS de Ingeniería Informática, 2–7 de Febrero, 2004 (2004), pp. 37–44.
44. M. Mutyam and K. Krithivasan, P systems with membrane creation: Universality and efficiency. Lect. Notes Comput. Sci. 2055, 276–287 (2001).
45. C. Zandron, A. Leporati, C. Ferretti, G. Mauri, and M. Pérez-Jiménez, On the computational efficiency of polarizationless recognizer P systems with strong division and dissolution. Fundamenta Informaticae 87(1), 79–91 (2008).
46. A. Alhazov and L. Pan, Polarizationless P systems with active membranes. Grammars 7(1), 141–159 (2004).
47. R. Ceterchi, R. Gramatovici, N. Jonoska, and K. Subramanian, Tissue-like P systems with active membranes for picture generation. Fundamenta Informaticae 56(4), 311–328 (2003).
48. R. Freund and A. Păun, P systems with active membranes and without polarizations. Soft Comput. 9(9), 657–663 (2005).
49. T.-O. Ishdorj and M. Ionescu, Replicative–distribution rules in P systems with active membranes. Lect. Notes Comput. Sci. 3407, 68–83 (2004).
50. G. Ciobanu, L. Pan, Gh. Păun, and M. Pérez-Jiménez, P systems with minimal parallelism. Theoret. Comput. Sci. 378(1), 117–130 (2007).
51. T. Song, L. Macías-Ramos, L. Pan, and M. Pérez-Jiménez, Time-free solution to SAT problem using P systems with active membranes. Theoret. Comput. Sci. 529, 61–68 (2014).
52. B. Song and L. Pan, Computational efficiency and universality of timed P systems with active membranes. Theoret. Comput. Sci. 567, 74–86 (2015).

53. P. Frisco, G. Govan, and A. Leporati, Asynchronous P systems with active membranes. Theoret. Comput. Sci. 429, 74–86 (2012).
54. A. Păun and Gh. Păun, The power of communication: P systems with symport/antiport. New Gen. Comput. 20(3), 295–306 (2002).
55. R. Freund, A. Alhazov, Yu. Rogozhin, and S. Verlan, Communication P systems. In Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), The Oxford Handbook of Membrane Computing (Oxford University Press, 2010), pp. 118–143.
56. P. Frisco and H. Hoogeboom, Simulating counter automata by P systems with symport/antiport. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, International Workshop, WMC-CdeA 2002, Curtea de Argeș, Romania, August 19–23, 2002, Revised Papers, vol. 2597, Lecture Notes in Computer Science (Springer, 2002), pp. 288–301.
57. R. Freund and A. Păun, Membrane systems with symport/antiport rules: Universality results. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, International Workshop, WMC-CdeA 2002, Curtea de Argeș, Romania, August 19–23, 2002, Revised Papers, vol. 2597, Lecture Notes in Computer Science (Springer, 2002), pp. 270–287.
58. R. Freund and M. Oswald, P systems with activated/prohibited membrane channels. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, International Workshop, WMC-CdeA 2002, Curtea de Argeș, Romania, August 19–23, 2002, Revised Papers, vol. 2597, Lecture Notes in Computer Science (Springer, 2002), pp. 261–269.
59. F. Bernardini and A. Păun, Universality of minimal symport/antiport: Five membranes suffice. In C. Martín-Vide, G. Mauri, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, International Workshop, WMC 2003, Tarragona, Spain, July 17–22, 2003, Revised Papers, vol. 2933, Lecture Notes in Computer Science (Springer, 2003), pp. 43–54.
60. A. Alhazov and Yu. Rogozhin, Towards a characterization of P systems with minimal symport/antiport and two membranes. In H. Hoogeboom, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, 7th International Workshop, WMC 2006, Leiden, The Netherlands, July 17–21, 2006, Revised, Selected, and Invited Papers, vol. 4361, Lecture Notes in Computer Science (Springer, 2006), pp. 135–153.
61. A. Alhazov, R. Freund, and M. Oswald, Symbol/membrane complexity of P systems with symport/antiport rules. In R. Freund, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, 6th International Workshop, WMC 2005, Vienna, Austria, July 18–21, 2005, Revised Selected and Invited Papers, vol. 3850, Lecture Notes in Computer Science (Springer, 2005), pp. 96–113.
62. E. Csuhaj-Varjú, M. Margenstern, Gy. Vaszil, and S. Verlan, On small universal antiport P systems. Theoret. Comput. Sci. 372(2–3), 152–164 (2007).

63. E. Csuhaj-Varjú and Gy. Vaszil, P automata or purely communicating accepting P systems. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, International Workshop, WMC-CdeA 2002, Curtea de Argeș, Romania, August 19–23, 2002, Revised Papers, vol. 2597, Lecture Notes in Computer Science (Springer, 2002), pp. 219–233.
64. R. Freund and M. Oswald, A short note on analysing P systems with antiport rules. Bull. EATCS 78, 231–236 (2002).
65. E. Csuhaj-Varjú, O. Ibarra, and Gy. Vaszil, On the computational complexity of P automata. Nat. Comput. 5(2), 109–126 (2006).
66. E. Csuhaj-Varjú and Gy. Vaszil, P automata with restricted power. Int. J. Found. Comput. Sci. 25(4), 391–408 (2014).
67. E. Csuhaj-Varjú, P and DP automata: Unconventional versus classical automata. Int. J. Found. Comput. Sci. 24(7), 995–1008 (2013).
68. A. Păun, Gh. Păun, and G. Rozenberg, Computing by communication in networks of membranes. Int. J. Found. Comput. Sci. 13(6), 779–798 (2002).
69. R. Freund, Gh. Păun, and M. Pérez-Jiménez, Tissue P systems with channel states. Theoret. Comput. Sci. 330(1), 101–116 (2005).
70. S. Verlan, F. Bernardini, M. Gheorghe, and M. Margenstern, Computational completeness of tissue P systems with conditional uniport. Lect. Notes Comput. Sci. 4361, 521–535 (2006).
71. F. Bernardini and M. Gheorghe, Cell communication in tissue P systems: Universality results. Soft Comput. 9(9), 640–649 (2005).
72. M. Cavaliere, Evolution–communication P systems. Lect. Notes Comput. Sci. 2597, 134–145 (2002).
73. E. Csuhaj-Varjú and S. Verlan, On generalized communicating P systems with minimal interaction rules. Theoret. Comput. Sci. 412(1–2), 124–135 (2011).
74. S. Krishna, M. Gheorghe, F. Ipate, E. Csuhaj-Varjú, and R. Ceterchi, Further results on generalised communicating P systems. Theoret. Comput. Sci. 701, 146–160 (2017).
75. S. Verlan, F. Bernardini, M. Gheorghe, and M. Margenstern, Generalized communicating P systems. Theoret. Comput. Sci. 404(1–2), 170–184 (2008).
76. E. Csuhaj-Varjú and S. Verlan, Computationally complete generalized communicating P systems with three cells. In M. Gheorghe, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing — 18th International Conference, CMC 2017, Bradford, UK, July 25–28, 2017, Revised Selected Papers, vol. 10725, Lecture Notes in Computer Science (Springer, 2017), pp. 118–128.
77. W. Reisig, Petri Nets: An Introduction. Monographs in Theoretical Computer Science (Springer-Verlag, Berlin, Heidelberg, 1985).
78. B. Song, C. Zhang, and L. Pan, Tissue-like P systems with evolutional symport/antiport rules. Inform. Sci. 378, 177–193 (2017).
79. J. Kelemen, A. Kelemenová, and Gh. Păun, Preview of P colonies: A biochemically inspired computing model. In Workshop and Tutorial Proceedings, Ninth International Conference on the Simulation and Synthesis of Living Systems (Alife IX), Boston, Massachusetts, USA (2004), pp. 82–86.

80. L. Pan, Gh. Păun, and B. Song, Flat maximal parallelism in P systems with promoters. Theoret. Comput. Sci. 623, 83–91 (2016).
81. B. Song, M. Pérez-Jiménez, Gh. Păun, and L. Pan, Tissue P systems with channel states working in the flat maximally parallel way. IEEE Trans. Nanobioscience 15(7), 645–656 (2016).
82. L. Pan, Y. Wang, S. Jiang, and B. Song, Flat maximal parallelism in tissue P systems with promoters. Rom. J. Inform. Sci. Technol. 20(1), 42–56 (2017).
83. Gh. Păun, M. Pérez-Jiménez, and A. Riscos-Núñez, Tissue P systems with cell division. Int. J. Comput. Commun. Control 3(3), 295–303 (2008).
84. D. Díaz-Pernil, M. Gutiérrez-Naranjo, M. Pérez-Jiménez, and A. Riscos-Núñez, A uniform family of tissue P systems with cell division solving 3-col in a linear time. Theoret. Comput. Sci. 404(1–2), 76–87 (2008).
85. D. Díaz-Pernil, M. Pérez-Jiménez, A. Riscos-Núñez, and A. Romero-Jiménez, Computational efficiency of cellular division in tissue-like membrane systems. Rom. J. Inform. Sci. Technol. 11(3), 229–241 (2008).
86. L. Pan and M. Pérez-Jiménez, Computational complexity of tissue-like P systems. J. Complex. 26(3), 296–315 (2010).
87. L. Pan, B. Song, L. Valencia-Cabrera, and M. Pérez-Jiménez, The computational complexity of tissue P systems with evolutional symport/antiport rules. Complexity 2018, 3745210:1–3745210:21 (2018).
88. B. Song and L. Pan, The computational power of tissue-like P systems with promoters. Theoret. Comput. Sci. 641, 43–52 (2016).
89. A. Alhazov, R. Freund, A. Leporati, M. Oswald, and C. Zandron, (Tissue) P systems with unit rules and energy assigned to membranes. Fundamenta Informaticae 74(4), 391–408 (2006).
90. A. Christinal, D. Díaz-Pernil, and T. Mathu, A uniform family of tissue P systems with protein on cells solving 3-coloring in linear time. Nat. Comput. 17(2), 311–319 (2018).
91. B. Song, L. Pan, and M. Pérez-Jiménez, Tissue P systems with protein on cells. Fundamenta Informaticae 144(1), 77–107 (2016).
92. H. Chen, M. Ionescu, T.-O. Ishdorj, A. Păun, Gh. Păun, and M. Pérez-Jiménez, Spiking neural P systems with extended rules: Universality and languages. Nat. Comput. 7(2), 147–166 (2008).
93. M. Ionescu, Gh. Păun, and T. Yokomori, Spiking neural P systems. Fundamenta Informaticae 71(2, 3), 279–308 (2006).
94. Gh. Păun, M. Pérez-Jiménez, and G. Rozenberg, Spike trains in spiking neural P systems. Int. J. Found. Comput. Sci. 17(04), 975–1002 (2006).
95. M. Cavaliere, O. Ibarra, Gh. Păun, O. Egecioglu, M. Ionescu, and S. Woodworth, Asynchronous spiking neural P systems. Theoret. Comput. Sci. 410(24–25), 2352–2364 (2009).
96. H. Chen, R. Freund, M. Ionescu, Gh. Păun, and M. Pérez-Jiménez, On string languages generated by spiking neural P systems. Fundamenta Informaticae 75(1–4), 141–162 (2007).

97. L. Pan, J. Wang, and H. Hoogeboom, Spiking neural P systems with astrocytes. Neural Comput. 24(3), 805–825 (2012).
98. J. Wang, H. Hoogeboom, L. Pan, Gh. Păun, and M. Pérez-Jiménez, Spiking neural P systems with weights. Neural Comput. 22(10), 2615–2646 (2010).
99. T. Wu, A. Păun, Z. Zhang, and L. Pan, Spiking neural P systems with polarizations. IEEE Trans. Neural Networks Learn. Syst. 29(8), 3349–3360 (2018).
100. X. Zhang, L. Pan, and A. Păun, On the universality of axon P systems. IEEE Trans. Neural Networks Learn. Syst. 26(11), 2816–2829 (2015).
101. H. Peng and J. Wang, Coupled neural P systems. IEEE Trans. Neural Networks Learn. Syst. 30(6), 1672–1682 (2019).
102. F. Cabarle, H. Adorna, M. Pérez-Jiménez, and T. Song, Spiking neural P systems with structural plasticity. Neural Comput. Appl. 26(8), 1905–1917 (2015).
103. X. Wang, T. Song, F. Gong, and P. Zheng, On the computational power of spiking neural P systems with self-organization. Sci. Rep. 6, 27624 (2016).
104. T. Wu, Z. Zhang, Gh. Păun, and L. Pan, Cell-like spiking neural P systems. Theoret. Comput. Sci. 623, 180–189 (2016).
105. L. Pan, Gh. Păun, G. Zhang, and F. Neri, Spiking neural P systems with communication on request. Int. J. Neural Syst. 27(08), 1750042 (2017).
106. A. Pavel, O. Arsene, and C. Buiu, Enzymatic numerical P systems — a new class of membrane computing systems. In Proceedings 2010 IEEE 5th International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2010), Liverpool, United Kingdom (January, 2010), pp. 1331–1336.
107. Gh. Păun and R. Păun, Membrane computing and economics: Numerical P systems. Fundamenta Informaticae 73(1, 2), 213–227 (2006).
108. C. Vasile, A. Pavel, I. Dumitrache, and Gh. Păun, On the power of enzymatic numerical P systems. Acta Informatica 49(6), 395–412 (2012).
109. Z. Zhang and L. Pan, Numerical P systems with thresholds. Int. J. Comput. Commun. Control 11(2), 292–304 (2016).
110. L. Pan, Z. Zhang, T. Wu, and J. Xu, Numerical P systems with production thresholds. Theoret. Comput. Sci. 673, 30–41 (2017).
111. L. Liu, W. Yi, Q. Yang, H. Peng, and J. Wang, Numerical P systems with Boolean condition. Theoret. Comput. Sci. 785, 140–149 (2019).
112. Z. Zhang, T. Wu, A. Păun, and L. Pan, Numerical P systems with migrating variables. Theoret. Comput. Sci. 641, 85–108 (2016).
113. C. Zandron, C. Ferretti, and G. Mauri, Solving NP-complete problems using P systems with active membranes. Unconvent. Models Comput. 289–301 (2000).
114. M. Pérez-Jiménez, A. Romero-Jiménez, and F. Sancho-Caparrini, Decision P systems and the P = NP conjecture. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, vol. 2597, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2003), pp. 388–399.

115. J. Dassow and Gh. Păun, On the power of membrane computing. J. Univ. Comput. Sci. 5(2), 33–49 (1999).
116. M. Minsky, Computation: Finite and Infinite Machines (Prentice-Hall, 1967).
117. P. Sosík and R. Freund, P systems without priorities are computationally universal. In Gh. Păun, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, vol. 2597, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2003), pp. 400–409.
118. P. Sosík and M. Langer, Small (purely) catalytic P systems simulating register machines. Theoret. Comput. Sci. 623, 65–74 (2016).
119. C. Ferretti, G. Mauri, Gh. Păun, and C. Zandron, On three variants of rewriting P systems. Theoret. Comput. Sci. 301(1–3), 201–215 (2003).
120. S. Abraham, Some questions of phrase structure grammars. Comput. Linguist. 4, 61–70 (1965).
121. S. Krishna, Languages of P systems: Computability and complexity. PhD Thesis (2001).
122. M. Madhu, Rewriting P systems: Improved hierarchies. Theoret. Comput. Sci. 334, 161–175 (2005).
123. M. Pérez-Jiménez, A. Romero-Jiménez, and F. Sancho-Caparrini, Complexity classes in cellular computing with membranes. Nat. Comput. 2(3), 265–285 (2003).
124. S. Krishna and R. Rama, A variant of P-systems with active membranes: Solving NP-complete problems. PhD Thesis (1999).
125. M. Pérez-Jiménez, A. Romero-Jiménez, and F. Sancho-Caparrini, A polynomial complexity class in P systems using membrane division. J. Automata, Lang. Combinat. 11(4), 423–434 (2006).
126. N. Murphy and D. Woods, Uniformity is weaker than semi-uniformity for some membrane systems. Fundamenta Informaticae 134(1–2), 129–152 (2014).
127. M. Gutiérrez-Naranjo, M. Pérez-Jiménez, A. Riscos-Núñez, and F. Romero-Campero, Characterizing tractability by cell-like membrane systems. In K.G. Subramanian, K. Rangarajan, and M. Mukund (eds.), Formal Models, Languages and Applications (World Scientific, 2007), pp. 137–154.
128. D. Díaz-Pernil, M. Pérez-Jiménez, and A. Romero-Jiménez, Efficient simulation of tissue-like P systems by transition cell-like P systems. Nat. Comput. 8(4), 797–806 (2009).
129. M. Pérez-Jiménez and A. Riscos-Núñez, A linear-time solution to the knapsack problem using P systems with active membranes. In C. Martín-Vide, G. Mauri, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, vol. 2933, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2004), pp. 250–268.
130. M. Pérez-Jiménez and A. Riscos-Núñez, Solving the Subset-Sum problem by active membranes. New Gen. Comput. 23(4), 367–384 (2005).
131. M. Gutiérrez-Naranjo, M. Pérez-Jiménez, and A. Riscos-Núñez, A fast P system for finding a balanced 2-partition. Soft Comput. 9, 673–678 (2005).

132. A. Alhazov, C. Martín-Vide, and L. Pan, Solving graph problems by P systems with restricted elementary active membranes. In N. Jonoska, Gh. Păun, and G. Rozenberg (eds.), Aspects of Molecular Computing: Essays Dedicated to Tom Head, on the Occasion of His 70th Birthday (Springer, Berlin Heidelberg, 2004), pp. 1–22.
133. M. Pérez-Jiménez and F. Romero-Campero, An efficient family of P systems for packing items into bins. J. Univ. Comput. Sci. 10(5), 650–670 (2004).
134. M. Pérez-Jiménez and F. Romero-Campero, Attacking the common algorithmic problem by recognizer P systems. In M. Margenstern (ed.), Machines, Computations, and Universality, vol. 3354, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2005), pp. 304–315.
135. A. Alhazov, C. Martín-Vide, and L. Pan, Solving a PSPACE-complete problem by recognizing P systems with restricted active membranes. Fundamenta Informaticae 58, 67–77 (2003).
136. A. Porreca, G. Mauri, and C. Zandron, Complexity classes for membrane systems. Informatique théorique et applications 40(2), 141–162 (2006).
137. P. Sosík and A. Rodríguez-Patón, P systems with active membranes characterize PSPACE. In C. Mao and T. Yokomori (eds.), DNA Computing, vol. 4287, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2006), pp. 33–46.
138. M. Gutiérrez-Naranjo, M. Pérez-Jiménez, A. Riscos-Núñez, and F. Romero-Campero, On the power of dissolution in P systems with active membranes. In R. Freund, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, vol. 3850, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2006), pp. 224–240.
139. A. Alhazov and M. Pérez-Jiménez, Uniform solution of QSAT using polarizationless active membranes. In J. Durand-Lose and M. Margenstern (eds.), Machines, Computations, and Universality (Springer, Berlin Heidelberg, 2007), pp. 122–133.
140. R. Gutiérrez-Escudero, M. Pérez-Jiménez, and M. Rius-Font, Characterizing tractability by tissue-like P systems. In Gh. Păun, M. Pérez-Jiménez, A. Riscos-Núñez, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, vol. 5957, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2010), pp. 289–300.
141. M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (W.H. Freeman and Company, 1979).
142. M. Pérez-Jiménez, A. Riscos-Núñez, M. Rius-Font, and F. Romero-Campero, A polynomial alternative to unbounded environment for tissue P systems with cell division. Int. J. Comput. Math. 90(4), 760–775 (2013).
143. M. Colomer, A. Margalida, and M. Pérez-Jiménez, Population dynamics P system (PDP) models: A standardized protocol for describing and applying novel bio-inspired computing tools. PLoS One 4, 1–13 (2013).
144. A. Regev and E. Shapiro, Cellular abstractions: Cells as computations. Nature 419(6905), 343 (2002).

145. M. Cardona, M. Colomer, M. Pérez-Jiménez, D. Sanuy, and A. Margalida, A P system modeling an ecosystem related to the bearded vulture. 6th Brainstorming Week on Membrane Computing 6, 51–66 (2008).
146. M. Cardona, M. Colomer, M. Pérez-Jiménez, D. Sanuy, and A. Margalida, Modeling ecosystems using P systems: The bearded vulture, a case study. Lect. Notes Comput. Sci. 5391, 137–156 (2009). Membrane Computing, 9th International Workshop, WMC 2008, Edinburgh, UK, July 28–31, 2008, Revised Selected and Invited Papers.
147. M. Cardona, M. Colomer, A. Margalida, I. Pérez-Hurtado, M. Pérez-Jiménez, and D. Sanuy, A P system based model of an ecosystem of some scavenger birds. Lect. Notes Comput. Sci. 5957, 182–195 (2010). Membrane Computing, 10th International Workshop, WMC 2009, Curtea de Argeș, Romania, August 24–27, 2009, Revised Selected and Invited Papers.
148. M. Cardona, M. Colomer, A. Margalida, A. Palau, I. Pérez-Hurtado, M. Pérez-Jiménez, and D. Sanuy, A computational modeling for real ecosystems based on P systems. Nat. Comput. 10, 39–53 (2011).
149. M. Colomer, A. Margalida, D. Sanuy, and M. Pérez-Jiménez, A bio-inspired computing model as a new tool for modeling ecosystems: The avian scavengers as a case study. Ecol. Modell. 222, 33–47 (2011).
150. M. Colomer, S. Lavín, I. Marco, A. Margalida, I. Pérez-Hurtado, M. Pérez-Jiménez, D. Sanuy, E. Serrano, and L. Valencia-Cabrera, Modeling population growth of Pyrenean Chamois (Rupicapra p. pyrenaica) by using P systems. Lect. Notes Comput. Sci. 6501, 144–159 (2011). Membrane Computing, 11th International Conference, CMC 2010, Jena, Germany, August 24–27, 2010, Revised Selected Papers.
151. M.A. Martínez-del-Amor, I. Pérez-Hurtado, M. Pérez-Jiménez, A. Riscos-Núñez, and M. Colomer, A new simulation algorithm for multienvironment probabilistic P systems. In K. Li, Z. Tang, R. Li, A. Nagar, and R. Thamburaj (eds.), IEEE Fifth International Conference on Bio-inspired Computing: Theories and Applications (BIC-TA 2010), vol. 1 (IEEE Press, Changsha, China, 2010), pp. 59–68.
152. M. Colomer, I. Pérez-Hurtado, M. Pérez-Jiménez, and A. Riscos-Núñez, Simulating tritrophic interactions by means of P systems. In A. Nagar, R. Thamburaj, K. Li, Z. Tang, and R. Li (eds.), IEEE Fifth International Conference on Bio-inspired Computing: Theories and Applications (BIC-TA 2010), vol. 2 (IEEE Press, Liverpool, UK, 2010), pp. 1621–1628.
153. M. Colomer, I. Pérez-Hurtado, A. Riscos-Núñez, and M. Pérez-Jiménez, Comparing simulation algorithms for multienvironment probabilistic P system over a standard virtual ecosystem. Nat. Comput. 11, 369–379 (2011).
154. M. Colomer, A. Margalida, L. Valencia-Cabrera, and A. Palau, Application of a computational model for complex fluvial ecosystems: The population dynamics of zebra mussel Dreissena polymorpha as a case study. Ecol. Complex. 20, 116–126 (2014).

155. Z. Huang, G. Zhang, D. Qi, H. Rong, M. Pérez-Jiménez, and L. Valencia-Cabrera, Application of probabilistic membrane systems to model giant panda population data. Comput. Syst. Appl. 26, 252–256 (2017).
156. H. Tian, G. Zhang, H. Rong, M. Pérez-Jiménez, L. Valencia-Cabrera, P. Chen, R. Hou, and D. Qi, Population model of giant panda ecosystem based on population dynamics P system. J. Comput. Appl. 38, 1488–1493 (2018).
157. Y. Duan, G. Zhang, D. Qi, L. Valencia-Cabrera, H. Rong, and M. Pérez-Jiménez, A review of membrane computing models for ecosystems and a case study on giant panda. Complexity, 1312824 (2020).
158. C. Buiu, C. Vasile, and O. Arsene, Development of membrane controllers for mobile robots. Inform. Sci. 187, 33–51 (2012).
159. X. Wang, G. Zhang, F. Neri, T. Jiang, J. Zhao, M. Gheorghe, F. Ipate, and R. Lefticaru, Design and implementation of membrane controllers for trajectory tracking of nonholonomic wheeled mobile robots. Integr. Comput.-Aided Eng. 23, 15–30 (2016).
160. X. Wang, G. Zhang, X. Gou, P. Paul, F. Neri, H. Rong, Q. Yang, and H. Zhang, Multi-behaviors coordination controller design with enzymatic numerical P systems for robots. Integr. Comput.-Aided Eng. 28(2), 119–150 (2020).
161. A. Florea and C. Buiu, Modelling multi-robot interactions using a generic controller based on numerical P systems and ROS. In IEEE Proceedings of 2017 9th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), June 29–July 01, vol. 1 (IEEE Press, 2017), pp. 1–6.
162. A. Florea and C. Buiu, A distributed approach to the control of multi-robot systems using XP colonies. Integr. Comput.-Aided Eng. 25, 15–29 (2018).
163. X. Wang, G. Zhang, J. Zhao, H. Rong, F. Ipate, and R. Lefticaru, A modified membrane-inspired algorithm based on particle swarm optimization for mobile robot path planning. Int. J. Comput. Commun. Control 10, 732–745 (2015).
164. I. Pérez-Hurtado, M. Pérez-Jiménez, G. Zhang, and D. Orellana-Martín, Simulation of rapidly-exploring random trees in membrane computing with P-Lingua and automatic programming. Int. J. Comput. Commun. Control 13, 1007–1031 (2018).
165. I. Pérez-Hurtado, M.A. Martínez-del-Amor, G. Zhang, F. Neri, and M. Pérez-Jiménez, A membrane parallel rapidly-exploring random tree algorithm for robotic motion planning. Integr. Comput.-Aided Eng. 1–18 (2020).
166. H. Peng, J. Wang, M. Pérez-Jiménez, H. Wang, J. Shao, and T. Wang, Fuzzy reasoning spiking neural P system for fault diagnosis. Inform. Sci. 235, 106–116 (2013).
167. G. Xiong, D. Shi, L. Zhu, and X. Duan, A new approach to fault diagnosis of power systems using fuzzy reasoning spiking neural P systems. Math. Probl. Eng. 815352 (2013).

168. T. Wang, G. Zhang, H. Rong, and M. Pérez-Jiménez, Application of fuzzy reasoning spiking neural P systems to fault diagnosis. Int. J. Comput. Commun. Control 9, 786–799 (2014).
169. T. Wang, G. Zhang, and M. Pérez-Jiménez, Fault diagnosis models for electric locomotive systems based on fuzzy reasoning spiking neural P systems. In M. Gheorghe, G. Rozenberg, A. Salomaa, P. Sosík, and C. Zandron (eds.), Membrane Computing (Springer, 2014), pp. 385–395.
170. T. Wang, G. Zhang, J. Zhao, Z. He, J. Wang, and M. Pérez-Jiménez, Fault diagnosis of electric power systems based on fuzzy reasoning spiking neural P systems. IEEE Trans. Power Syst. 30, 1182–1194 (2015).
171. T. Wang, G. Zhang, M. Pérez-Jiménez, and J. Cheng, Weighted fuzzy reasoning spiking neural P systems: Application to fault diagnosis in traction power supply systems of high-speed railways. J. Comput. Theoret. Nanosci. 12, 1103–1114 (2015).
172. Y. He, T. Wang, K. Huang, G. Zhang, and M. Pérez-Jiménez, Fault diagnosis of metro traction power systems using a modified fuzzy reasoning spiking neural P system. Rom. J. Inform. Sci. Technol. 18, 256–272 (2015).
173. Y. Yahya, A. Qian, and A. Yahya, Power transformer fault diagnosis using fuzzy reasoning spiking neural P systems. J. Intell. Learn. Syst. Appl. 8, 77–91 (2016).
174. J. Wang, H. Peng, W. Yu, J. Ming, M. Pérez-Jiménez, C. Tao, and X. Huang, Interval-valued fuzzy spiking neural P systems for fault diagnosis of power transmission networks. Eng. Appl. Artif. Intell. 82, 102–109 (2019).
175. H. Peng, J. Wang, J. Ming, P. Shi, M. Pérez-Jiménez, W. Yu, and C. Tao, Fault diagnosis of power systems using intuitionistic fuzzy spiking neural P systems. IEEE Trans. Smart Grid 9, 4777–4784 (2018).
176. H. Rong, K. Yi, G. Zhang, J. Dong, P. Paul, and Z. Huang, Automatic implementation of fuzzy reasoning spiking neural P systems for diagnosing faults in complex power systems. Complexity 2019 (2019).
177. G. Zhang, H. Rong, F. Neri, and M. Pérez-Jiménez, An optimization spiking neural P system for approximately solving combinatorial optimization problems. Int. J. Neural Syst. 24, 1–16 (2014).
178. T. Wang, S. Zeng, G. Zhang, M. Pérez-Jiménez, and J. Wang, Fault section estimation of power systems with optimization spiking neural P systems. Rom. J. Inform. Sci. Technol. 18, 240–255 (2015).
179. T. Min, J. Wang, H. Peng, and P. Shi, Application of adaptive fuzzy spiking neural P systems in fault diagnosis of power systems. Chin. J. Electr. 23, 87–92 (2014).
180. J. Wang, H. Peng, M. Tu, M. Pérez-Jiménez, and P. Shi, A fault diagnosis method of power systems based on an improved adaptive fuzzy spiking neural P systems and PSO algorithms. Chin. J. Electr. 25, 320–327 (2016).
181. G. Zhang, C. Liu, and H. Rong, Analyzing radar emitter signals with membrane algorithms. Math. Comput. Modell. 52, 1997–2010 (2010).

182. G. Zhang, M. Gheorghe, and Y. Li, A membrane algorithm with quantum-inspired subalgorithms and its application to image processing. Nat. Comput. 11, 701–717 (2012).
183. G. Zhang, J. Cheng, M. Gheorghe, and Q. Meng, A hybrid approach based on differential evolution and tissue membrane systems for solving constrained manufacturing parameter optimization problems. Appl. Soft Comput. 13, 1528–1542 (2013).
184. J. Xiao, Y. Liu, S. Zhang, and P. Chen, An adaptive membrane evolutionary algorithm for solving constrained engineering optimization problems. J. Univ. Comput. Sci. 23, 652–672 (2017).
185. M. Gutiérrez-Naranjo, M. Pérez-Jiménez, and A. Riscos-Núñez, Available membrane computing software. In G. Ciobanu, Gh. Păun, and M. J. Pérez-Jiménez (eds.), Applications of Membrane Computing (Springer, Berlin Heidelberg, 2006), pp. 411–436.
186. D. Díaz-Pernil, C. Graciani-Díaz, M. Gutiérrez-Naranjo, I. Pérez-Hurtado, and M. Pérez-Jiménez, Software for P systems. In Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), The Oxford Handbook of Membrane Computing (Oxford University Press, 2010), pp. 437–454.
187. S. Raghavan and K. Chandrasekaran, Tools and simulators for membrane computing — a literature review. In M. Gong, L. Pan, T. Song, and G. Zhang (eds.), Bio-inspired Computing — Theories and Applications (Springer Singapore, 2016), pp. 249–277.
188. M.A. Martínez-del-Amor, M. García-Quismondo, L. Macías-Ramos, L. Valencia-Cabrera, A. Riscos-Núñez, and M. Pérez-Jiménez, Simulating P systems on GPU devices: A survey. Fundamenta Informaticae 136, 269–284 (2015).
189. M.A. Martínez-del-Amor, L. Macías-Ramos, L. Valencia-Cabrera, and M. Pérez-Jiménez, Parallel simulation of Population Dynamics P systems: Updates and roadmap. Nat. Comput. 15(4), 565–573 (2015).
190. J. Carandang, F. Cabarle, H. Adorna, N. Hernandez, and M.A. Martínez-del-Amor, Handling non-determinism in Spiking Neural P systems: Algorithms and simulations. Fundamenta Informaticae 164, 139–155 (2019).
191. L. Valencia-Cabrera, D. Orellana-Martín, M.A. Martínez-del-Amor, and M. Pérez-Jiménez, From super-cells to robotic swarms: Two decades of evolution in the simulation of P systems. Bull. Int. Membr. Comput. Soc. 4, 65–87 (2017).
192. G. Ciobanu and G. Wenyuan, A parallel implementation of transition P systems. In A. Alhazov, C. Martín-Vide, and Gh. Păun (eds.), Pre-proceedings of the Workshop on Membrane Computing, Tarragona (July 17–22, 2003), pp. 169–184.
193. B. Petreska and C. Teuscher, A reconfigurable hardware membrane system. In C. Martín-Vide, G. Mauri, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, International Workshop, WMC 2003, Tarragona, Spain, July 17–22, 2003, Revised Papers, vol. 2933, Lecture Notes in Computer Science (Springer, 2003), pp. 269–285.

194. D. Díaz-Pernil, I. Pérez-Hurtado, M. Pérez-Jiménez, and A. Riscos-Núñez, A P-Lingua programming environment for membrane computing. In D. W. Corne, P. Frisco, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing (Springer, Berlin Heidelberg, 2009), pp. 187–203.
195. The P-Lingua web page, last accessed February 2020. http://www.p-lingua.org.
196. E. Keinan, Membrane computing. Google Patents (2009).
197. Meta P-Lab web site (MP virtual laboratory), last accessed February 2020. http://mplab.scienze.univr.it/index.html.
198. J. Blakes, J. Twycross, F. Romero-Campero, and N. Krasnogor, The Infobiotics Workbench: An integrated in silico modeling platform for systems and synthetic biology. Bioinformatics 27, 3323–3324 (2011).
199. S. Konur, L. Mierlă, F. Ipate, and M. Gheorghe, kPWorkbench: A software suite for membrane systems. SoftwareX 11, 100407 (2020).
200. M. García-Quismondo, R. Gutiérrez-Escudero, I. Pérez-Hurtado, M. Pérez-Jiménez, and A. Riscos-Núñez, An overview of P-Lingua 2.0. Lect. Notes Comput. Sci. 5957, 264–288 (2010). Membrane Computing, 10th International Workshop, WMC 2009, Curtea de Argeș, Romania, August 24–27, 2009, Revised Selected and Invited Papers.
201. I. Pérez-Hurtado, L. Valencia-Cabrera, M. Pérez-Jiménez, M. Colomer, and A. Riscos-Núñez, MeCoSim: A general purpose software tool for simulating biological phenomena by means of P systems. In IEEE Fifth International Conference on Bio-inspired Computing: Theories and Applications (BIC-TA 2010) (2010), pp. 637–643.
202. P. Guo, C. Quan, and L. Ye, UPSimulator: A general P system simulator. Knowl.-Based Syst. 170, 20–25 (2019).
203. D. Kirk and W.-m. W. Hwu, Programming Massively Parallel Processors: A Hands-on Approach, 3rd edn. (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2016).
204. M.A. Martínez-del-Amor, M. García-Quismondo, L. Macías-Ramos, L. Valencia-Cabrera, A. Riscos-Núñez, and M. Pérez-Jiménez, Simulating P systems on GPU devices: A survey. Fundamenta Informaticae 136(3), 269–284 (2015).
205. M.A. Martínez-del-Amor, D. Orellana-Martín, I. Pérez-Hurtado, L. Valencia-Cabrera, A. Riscos-Núñez, and M. Pérez-Jiménez, Design of specific P systems simulators on GPUs. In T. Hinze, G. Rozenberg, A. Salomaa, and C. Zandron (eds.), Membrane Computing, vol. 11399, Lecture Notes in Computer Science (Springer International Publishing, 2019), pp. 202–207.
206. J. Cecilia, J. García, G. Guerrero, M.A. Martínez-del-Amor, M. Pérez-Jiménez, and M. Ujaldón, The GPU on the simulation of cellular computing models. Soft Comput. 16(2), 231–246 (2012).
207. The PMCGPU (Parallel simulators for Membrane Computing on the GPU) project website, last accessed February 2020. http://sourceforge.net/p/pmcgpu.

208. J. Cecilia, J. García, G. Guerrero, M.A. Martínez-del-Amor, I. Pérez-Hurtado, and M. Pérez-Jiménez, Simulation of P systems with active membranes on CUDA. Brief. Bioinform. 11(3), 313–322 (2010).
209. J. Cecilia, J. García, G. Guerrero, M.A. Martínez-del-Amor, I. Pérez-Hurtado, and M. Pérez-Jiménez, Simulating a P system based efficient solution to SAT by using GPUs. J. Logic Algebr. Programm. 79(6), 317–325 (2010).
210. A. Maroosi, R. Muniyandi, E. Sundararajan, and A. M. Zin, Parallel and distributed computing models on a graphics processing unit to accelerate simulation of membrane systems. Simul. Modell. Pract. Theory 47, 60–78 (2014).
211. M.A. Martínez-del-Amor, J. Pérez-Carrasco, and M. Pérez-Jiménez, Characterizing the parallel simulation of P systems on the GPU. Int. J. Unconvent. Comput. 9(5–6), 405–424 (2013).
212. F. Peña-Cantillana, D. Díaz-Pernil, H. Christinal, and M. Gutiérrez-Naranjo, Implementation on CUDA of the smoothing problem with tissue-like P systems. Int. J. Natural Comput. Res. 2(3), 25–34 (2011).
213. D. Díaz-Pernil, A. Berciano, F. Peña-Cantillana, and M. Gutiérrez-Naranjo, Segmenting images with gradient-based edge detection using membrane computing. Pattern Recogn. Lett. 34(8), 846–855 (2013).
214. F. Cabarle, H. Adorna, and M.A. Martínez, A spiking neural P system simulator based on CUDA. In M. Gheorghe, Gh. Păun, G. Rozenberg, A. Salomaa, and S. Verlan (eds.), Membrane Computing, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2012), pp. 87–103.
215. F. Cabarle, H. Adorna, M.A. Martínez-del-Amor, and M. Pérez-Jiménez, Improving GPU simulations of spiking neural P systems. Rom. J. Inform. Sci. Technol. 15(1), 5–20 (2012).
216. J. Carandang, J. Villaflores, F. Cabarle, H. Adorna, and M.A. Martínez-del-Amor, CuSNP: Spiking neural P systems simulators in CUDA. Rom. J. Inform. Sci. Technol. 20(1), 57–70 (2017).
217. L. Macías-Ramos, M.A. Martínez-del-Amor, and M. Pérez-Jiménez, Simulating FRSN P systems with real numbers in P-Lingua on sequential and CUDA platforms. In G. Rozenberg, A. Salomaa, J. Sempere, and C. Zandron (eds.), Membrane Computing (Springer International Publishing, 2015), pp. 262–276.
218. M.A. Martínez-del-Amor, I. Pérez-Hurtado, A. Gastalver-Rubio, A. Elster, and M. Pérez-Jiménez, Population Dynamics P systems on CUDA. In D. Gilbert and M. Heiner (eds.), Computational Methods in Systems Biology, vol. 7605, Lecture Notes in Computer Science (Springer, Berlin Heidelberg, 2012), pp. 247–266.
219. M.A. Martínez-del-Amor, I. Pérez-Hurtado, D. Orellana-Martín, and M. Pérez-Jiménez, Adaptative parallel simulators for bioinspired computing models. Future Gen. Comput. Syst. 107, 469–484 (2020).

220. M. García-Quismondo, L. Macías-Ramos, and M. Pérez-Jiménez, Implementing enzymatic numerical P systems for AI applications by means of graphic processing units. In J. Kelemen, J. Romportl, and E. Zackova (eds.), Beyond Artificial Intelligence: Contemplations, Expectations, Applications (Springer, Berlin Heidelberg, 2013), pp. 137–159.
221. I. Pérez-Hurtado, M.A. Martínez-del-Amor, G. Zhang, F. Neri, and M. Pérez-Jiménez, A membrane parallel rapidly-exploring random tree algorithm for robotic motion planning. Integr. Comput.-Aided Eng. 27(2), 121–138 (2020).
222. F. Ipate, R. Lefticaru, L. Mierlă, L. Valencia-Cabrera, H. Han, G. Zhang, C. Dragomir, M. Pérez-Jiménez, and M. Gheorghe, Kernel P systems: Applications and implementations. In Proc. 8th Int. Conf. on Bio-Inspired Computing: Theories and Applications, vol. 212, Advances in Intelligent Systems and Computing (2013), pp. 1081–1089.
223. N. Elkhani, R. Muniyandi, and G. Zhang, Multi-objective binary PSO with kernel P system on GPU. Int. J. Comput. Commun. Control 13, 323–336 (2018).
224. R. Juayong, F. Cabarle, H. Adorna, and M.A. Martínez-del-Amor, On the simulations of evolution-communication P systems with energy without antiport rules for GPUs. In Proc. Tenth Brainstorming Week on Membrane Computing (2012), pp. 267–290.
225. X. Zhang, B. Wang, Z. Ding, J. Tang, and J. He, Implementation of membrane algorithms on GPU. J. Appl. Math. 2014 (2014).
226. G. Mealy, A method for synthesizing sequential circuits. The Bell Syst. Tech. J. 34(5), 1045–1079 (1955).
227. B. Petreska and C. Teuscher, A reconfigurable hardware membrane system. In C. Martín-Vide, G. Mauri, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, International Workshop, WMC 2003, Tarragona, Spain, July 17–22, 2003, Revised Papers, vol. 2933, Lecture Notes in Computer Science (Springer, 2003), pp. 269–285.
228. V. Nguyen, D. Kearney, and G. Gioiosa, Balancing performance, flexibility, and scalability in a parallel computing platform for membrane computing applications. In G. Eleftherakis, P. Kefalas, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, 8th International Workshop, WMC 2007, Thessaloniki, Greece, June 25–28, 2007, Revised Selected and Invited Papers, vol. 4860, Lecture Notes in Computer Science (Springer, 2007), pp. 385–413.
229. V. Nguyen, D. Kearney, and G. Gioiosa, An implementation of membrane computing using reconfigurable hardware. Comput. Informat. 27(3+), 551–569 (2008).
230. V. Nguyen, D. Kearney, and G. Gioiosa, An algorithm for non-deterministic object distribution in P systems and its implementation in hardware. In D. Corne, P. Frisco, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing — 9th International Workshop, WMC 2008, Edinburgh, UK, July 28–31, 2008, Revised Selected and Invited Papers, vol. 5391, Lecture Notes in Computer Science (Springer, 2008), pp. 325–354.

231. V. Nguyen, D. Kearney, and G. Gioiosa, A region-oriented hardware implementation for membrane computing applications. In Gh. Păun, M. Pérez-Jiménez, A. Riscos-Núñez, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, 10th International Workshop, WMC 2009, Curtea de Argeș, Romania, August 24–27, 2009, Revised Selected and Invited Papers, vol. 5957, Lecture Notes in Computer Science (Springer, 2009), pp. 385–409.
232. V. Nguyen, D. Kearney, and G. Gioiosa, An extensible, maintainable and elegant approach to hardware source code generation in Reconfig-P. J. Log. Algebr. Program. 79(6), 383–396 (2010).
233. V. Nguyen, An implementation of the parallelism, distribution and non-determinism of membrane computing models on reconfigurable hardware. PhD Thesis, University of South Australia (2010).
234. S. Verlan and J. Quiros, Fast hardware implementations of P systems. In E. Csuhaj-Varjú, M. Gheorghe, G. Rozenberg, A. Salomaa, and Gy. Vaszil (eds.), Membrane Computing — 13th International Conference, CMC 2012, Budapest, Hungary, August 28–31, 2012, Revised Selected Papers, vol. 7762, Lecture Notes in Computer Science (Springer, 2012), pp. 404–423.
235. J. Quiros, S. Verlan, J. Viejo, A. Millán, and M. Bellido, Fast hardware implementations of static P systems. Comput. Inform. 35(3), 687–718 (2016).
236. R. Freund, A. Leporati, G. Mauri, A. Porreca, S. Verlan, and C. Zandron, Flattening in (tissue) P systems. In A. Alhazov, S. Cojocaru, M. Gheorghe, Yu. Rogozhin, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing — 14th International Conference, CMC 2013, Chișinău, Republic of Moldova, August 20–23, 2013, Revised Selected Papers, vol. 8340, Lecture Notes in Computer Science (Springer, 2013), pp. 173–188.
237. R. Freund and S. Verlan, A formal framework for static (tissue) P systems. In G. Eleftherakis, P. Kefalas, Gh. Păun, G. Rozenberg, and A. Salomaa (eds.), Membrane Computing, 8th International Workshop, WMC 2007, Thessaloniki, Greece, June 25–28, 2007, Revised Selected and Invited Papers, vol. 4860, Lecture Notes in Computer Science (Springer, 2007), pp. 271–284.
238. Z. Shang, S. Verlan, and G. Zhang, Hardware implementation of numerical P systems. In Gh. Păun (ed.), Proceedings of the 20th International Conference on Membrane Computing, CMC20, August 5–8, 2019, Curtea de Argeș (Romania, 2019), pp. 463–474.
239. Z. Shang, S. Verlan, G. Zhang, and I. Pérez-Hurtado, FPGA implementation of robot obstacle avoidance controller based on enzymatic numerical P systems. In G. Zhang, L. Pan, and X. Liu (eds.), Pre-Proceedings of The 8th Asian Conference on Membrane Computing (ACMC2019), November 14–17, 2019, Xiamen (China, 2019), pp. 184–214.
240. A. E. Porreca, N. Murphy, and M. J. Pérez-Jiménez, An optimal frontier of the efficiency of tissue P systems with cell division. In M. García-Quismondo, L. F. Macías-Ramos, I. Pérez-Hurtado, and L. Valencia-Cabrera (eds.), Proceedings of the Tenth Brainstorming Week on Membrane Computing, Volume II, Report RGNC 01/2012, Fénix Editora (Sevilla, Spain, 2012), pp. 187–203.

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0009

Chapter 9

Computing with Modest Resources: How to Have Your Cake and Eat it Too

Vasileios Athanasiou and Zoran Konkoli∗

Department of Microtechnology and Nanoscience — MC2,
Chalmers University of Technology, Gothenburg SE-41296, Sweden
∗[email protected]

This chapter focuses on the concept of physical computation. Physical computation implies implementation costs, either in terms of the materials that need to be used, the engineering effort that has to be invested in building the device, or, once built, the energy needed to operate it. The conceptual primitives necessary for understanding the interplay between the costs of realizing physical computation and the ultimate computing capacity one can harvest from the device will be discussed.

9.1. Introduction

Computation is physical and, because of that, it does not come for free. Computation has to be realized, and this costs resources, in materials, human effort, and energy.1, 2 It is important to understand the balance between these costs of realizing computation and to measure them against the resulting computing power of the device.3–5 This subtle balance between the two might be of central importance to mankind. Can we continue the information technology race unchallenged in the future, given the limited amount of resources at our disposal? Of course, the question is irrelevant if one is of the opinion that we do not need to continue building ever faster computing machines. One might argue that the financial resources

at our disposal are limited, that computers are already sufficiently fast, and that we should employ our limited resources elsewhere (e.g., to build hospitals). However, one could easily argue in the opposite direction, that the need for better-performing information processing technologies will only increase, and stretch the argument towards the claim that building more powerful information processing devices is essential for our survival. Given the trends of the last century, one would perhaps be naive to think otherwise.4 Nevertheless, regardless of the position one takes in the above discussion, one thing is clear: it is always better to achieve the same performance at a lower cost. The chapter at hand focuses on analyzing the factors that control the balance between the resources needed to implement a computation and the resulting computing power that comes out of the device. The goal is to provide some concrete guidelines on how to think about this, and ultimately to envision a strategy for realizing resource-efficient computation.

Computation at every level is physical, whether performed by a simple organism, by humans or, ultimately, by a machine engineered by humans:

• Even a tiny organism such as a bacterium finding food computes: does a bacterium think in some metaphysical way when performing chemotaxis? Of course not; it is all physical and well understood. The motion of the bacterium is highly regulated by its biochemical machinery, which involves a complex interplay between sensory input, signalling cascades, and gene expression, all governing the motion of the proteins in the living cell, with the flagellar motor as the point of convergence of all these processes. Even the purest of computation, human thought, has to be realized physically, being a by-product of the neural firing patterns in the brain, a process that at the microscopic level is regulated by the opening and closing of various types of ion channels.

• Mankind has been designing computing machines for thousands of years, in principle since we started developing rudimentary counting systems and ways of representing them, using fingers, marbles, or tying a complex set of knots. The same holds for

The same holds for contemporary human beings. We have developed various ingenious ways to achieve computation, from designing mechanical computers to using electrons traveling through a material, as in the modern computer. A great deal of effort has been spent on building computing machines carefully designed from scratch with information processing in mind. However, the notion of interest to us is that of "accidental computation". Though it might seem strange, we posit that thinking in these terms can give us some general ideas of how to achieve resource-efficient computation in general.

Let us pose some motivational questions: Can a system that has not been designed for computation from scratch be used for that purpose? How much can we compute with a given system? Note that another, more philosophical approach would be to ask a related but slightly different question: Is computation inherent in everything around us? Indeed, both questions are strongly related, and it is hard to separate them. A rather practical attitude is adopted here, with the following agenda in mind: provided a fair argument can be put forward that "accidental computation" is indeed possible, and ubiquitous in nature, this ought to encourage us to think further about the opportunities we might be missing in terms of engineering novel information processing solutions.

In this chapter we pose some interesting paradoxes, arguing that one should be asking different types of questions from the ones discussed above. We specifically focus on the notion of computing with dynamical systems that are just there, that is, systems that have not been specifically designed for computation, at least not from scratch. Indeed, we are entering a rather complex subject, a philosophical minefield, where things are not what they seem. A typical example is the notion of a computing rock.6 In a now legendary book appendix, Hilary Putnam illustrated that even a system with extremely rudimentary dynamics has the capacity to operate as any finite-state I/O automaton. Putnam suggested a construct in which a simple rock can be turned into an automaton of arbitrary complexity.

Of course, there are plenty of ways to criticize this idea of the rock as a universal computing machine, but we shall not enter into such discussions. Suffice it to say that information processing engineers are not using rocks to design computers; had it been possible, laptops would have been much, much cheaper. The point being made here is that the issue of physical computation is highly non-trivial, riddled with paradoxes and pitfalls.

Understanding why we cannot use a rock to build a laptop is precisely the kind of question we are going to be asking. Insisting on the question, one might further ask whether there is a special type of rock that could be used to build a laptop. Ultimately, if we have to give up on the rock, what else could we use? How much more complex than a rock does the system have to be? Clearly, such thoughts naturally motivate the need to operate with the notion of "computing capacity" and to understand the resources needed to realize it. How can we measure computing capacity, both in terms of performance and in terms of costs? What are the factors that shape all these features?

In the following text we argue that the ability of a dynamical system to achieve phase space separation is the key feature that ensures resource-efficient computation. In other words, phase space separation ensures a favorable balance between computing capacity and the cost of implementation. And here comes the main insight: in the following we carefully argue that this phase space separation might be achievable at a modest implementation cost. Naturally, this will not be true for every dynamical system, but there is a large class of systems that might fit the bill.

Though the entire discussion might seem abstract, it has a strong practical side. The discussion is of paramount importance for designing information processing applications where the goal is to collect and process information in situ, without sending it to a centralized information processing unit. This happens in various application contexts related to system monitoring and subsequent on-line data analysis. One of the key requirements is that the energy consumption of such a computing device should be low. "How can we design computing devices that are energy efficient?" is the key question.

Why is this a key question? Well, it is not easy to design such devices. The bigger we make them, the smarter they are; however, the energy consumption grows with the size of the computing machine, or more precisely with the number of components used to make it. The modern computer is an energy-hungry beast. We cannot attach a laptop to a bird to observe how a flock of birds behaves in the wild. What can we do instead? In the following we provide some examples arguing that the right approach might be a middle ground, where we exploit ideas pertinent to unconventional computation, harvesting what is around, and tailor/tweak these systems for special-purpose computation. This somewhat resembles the idea of "functional diversification", which has been around for quite a long time now.7

A brute-force response to the functional diversification challenge is to consider complex dynamical systems, ones that already harbor an enormous computing potential arising from the interactions between a large number of components. What is new is that in the following text we challenge the notion that bigger is necessarily better. In fact, in various ways we argue it is not, since it might not always be necessary to build such large computing systems. In the following we offer some thoughts on the options that can be exploited to realize the agenda of smart, energy-efficient physical computation. Every subsection discusses one such option, and the final section provides an overarching synthesis of the principles common to all the ideas.

9.2. Why Bigger is Not Necessarily Smarter

Bigger usually implies smarter, of course. A typical example of achieving brute-force computing power by scaling up the number of computing elements is the artificial neural network, and the recent breakthroughs in deep learning techniques. The intelligence of the system is known to increase with the number of neurons, plain and simple. The only problem one had earlier was training such large networks: it was not easy to adjust (train) the system towards a specific information processing goal.

Since the training issue has been solved, we can easily cross borders with automatic passport control, though we may have to watch out for self-driving cars once we exit the airport.

Reservoir computing has been suggested as another option for taking care of the network training issue. The underlying idea is very simple: if it is hard to train the system, do not train it. A skeleton of a reservoir computer is shown in Figure 9.1. A reservoir computer is essentially a dynamical system that accepts a time-series input and has a state that can be observed to produce the output. Instead of wasting resources on training the whole system, one trains only the readout layer. The intelligence should come from the intrinsic ability of the system to compute. To achieve intelligence, one should make the system complex enough, arranging for extensive interconnectivity and feedback (recurrent neural networks). Why should such a strategy work? It is possible to justify these ideas rigorously: if certain well-defined mathematical properties can be ensured, then the dynamical system (the reservoir) can be used in this way. A pedagogical explanation of what these properties represent can be found in Ref. [3]. The readout layer can be something simple that is easy to train; in fact, often a simple linear readout layer is enough. This is what makes the reservoir computing approach powerful.
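To make the division of labor concrete, the sketch below implements a minimal reservoir computer in Python: a fixed random recurrent network serves as the reservoir, and only a linear readout is fitted by least squares. This is an illustrative toy, not the specific setup used in this chapter; the network sizes, scaling factors, and the example task (predicting a lagged sine wave) are our own arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed, untrained reservoir: N coupled units driven by the input q(t).
N = 100
W_in = rng.uniform(-0.5, 0.5, size=N)          # input weights (untrained)
W = rng.normal(0, 1, size=(N, N))              # recurrent weights (untrained)
W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # scale for stable dynamics

def run_reservoir(q):
    """Drive the reservoir with the input series q; collect states x(t)."""
    x = np.zeros(N)
    states = []
    for q_t in q:
        x = np.tanh(W @ x + W_in * q_t)        # state update
        states.append(x.copy())
    return np.array(states)

# Toy task: from q(t), reproduce q(t - 5) -- requires memory of the input.
t = np.arange(1000)
q = np.sin(0.1 * t)
y_target = np.roll(q, 5)

X = run_reservoir(q)

# Only the linear readout is trained (ordinary least squares).
w_out, *_ = np.linalg.lstsq(X[100:], y_target[100:], rcond=None)
y_pred = X[100:] @ w_out
print("readout error:", np.mean((y_pred - y_target[100:]) ** 2))
```

The design choice worth noting is that all the "hard" degrees of freedom (W, W_in) stay frozen; training touches only the vector w_out.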

Figure 9.1. Reservoir computer. Input: q(t) denotes the time series data that is the input to the computation. R denotes a dynamical system that “accepts” the input signal q. At a given time instance t the state of the system x(t) changes depending on the previous state of the reservoir and the input provided. The readout layer is used to inspect the state of the dynamical system and produce the output.

Nevertheless, despite the ease of training, the reservoir computing agenda implies the same brutal need for engineering "large" structures as deep learning: both thrive in the landscape of extensively large systems with many components. The point being made here is that such systems might be hard to engineer (if not available in nature) and might be equally energy consuming. Of course, it depends on what one chooses as the reservoir.

9.3. Why Smaller Might be Smarter

The recent reservoir computing advances teach us an important lesson: in the end, it is all about a clever balance of computing resources. The ease of training implies that there is an intrinsic intelligence in the system. But where does the intelligence reside? Perhaps one of the main achievements of the reservoir computing community has been the insight that the intelligence of a reservoir computer comes from the intrinsic feedback mechanisms that are an essential part of random recurrent neural networks.

An interesting twist in the development and maturing of reservoir computing ideas is the notion of delayed feedback. If the feedback is important, why not engineer it? It has been shown that one can thin down the size of the system provided it is equipped with an additional delayed feedback mechanism. Admittedly, one trades one engineering requirement for another, hoping to reduce the system size without losing computing power. Sure enough, if the delayed feedback can be engineered at a modest implementation cost, then why not? In fact, it has been shown that when delayed feedback is exploited, even a single node can compute quite a lot.8 In principle, what happens is that the delayed feedback mechanism creates virtual nodes; a minimal sketch of this construction is given below. This is exactly the idea we would like to spin on, since awareness of the past, that is, some dependence on the past, some sort of knowledge of the history, is a common trademark of all dynamical systems. And here it becomes extremely interesting: the notion of engineered awareness of the past (external delayed feedback) versus spontaneous awareness of the past (plasticity of the response). What distinguishes them? Are they different at all?
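The following Python sketch illustrates the virtual-node idea of Ref. [8] in its simplest form: one nonlinear node, fed by a time-multiplexed (masked) input and its own delayed output, so that the delay line holds N_v "virtual" node states that together play the role of an N_v-unit reservoir. All parameter values here are illustrative placeholders, not the settings used in the cited work.

```python
import numpy as np

rng = np.random.default_rng(1)

N_v = 50                                    # number of virtual nodes
mask = rng.choice([-1.0, 1.0], size=N_v)    # fixed input mask (untrained)
eta, gamma = 0.5, 0.8                       # input/feedback strengths (illustrative)

def single_node_reservoir(q):
    """One nonlinear node plus a delay line of length N_v (virtual nodes)."""
    delay = np.zeros(N_v)                   # delayed outputs = virtual node states
    states = []
    for q_t in q:
        for i in range(N_v):
            # Each virtual node sees the masked input plus the value the
            # delay line fed back from one full delay period earlier.
            delay[i] = np.tanh(eta * mask[i] * q_t + gamma * delay[i])
        states.append(delay.copy())
    return np.array(states)                 # shape: (time steps, N_v)

# As before, only a linear readout on the virtual-node states is trained.
q = np.sin(0.1 * np.arange(500))
X = single_node_reservoir(q)
w_out, *_ = np.linalg.lstsq(X[50:], np.roll(q, 5)[50:], rcond=None)
print("trained readout over", N_v, "virtual nodes from a single physical node")
```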

We have taken these ideas one step further by specifically focusing on the agenda of achieving maximum intelligence from minimal systems.9–17 Extensive simulations of memristor networks have shown that even a single memristor, if operated properly, can match the performance of a deep neural network on a relatively simple classification task such as ECG signal classification.16 For more complicated tasks, such as early sepsis detection, the same holds with a modest increase in the number of memristors.17 Of course, one could think of a law of conservation of computation: computation cannot come out of nothing, and cannot disappear into nothing (though the last part might not be strictly true, as anyone who shuts down a computer knows). Thus all this ought to be a "bluff". Where is this seemingly extraterrestrial intelligence coming from, then?

9.4. What is External Drive and How Does It Help

Our studies have shown that a chunk of additional intelligence can be engineered in the form of an external drive signal. The idea is illustrated in Figure 9.2. Since dynamical systems naturally respond to external influences, engineering such a drive signal should not present a big engineering challenge. In the following, we summarize how one can enhance the computing power of simple dynamical systems by carefully preparing an external drive.

Figure 9.2. Reservoir computer with a drive signal. The system is the same as the one shown in Figure 9.1, with one modification. Input: q(t) denotes the time series data one wishes to analyze. Drive: u(t) is an auxiliary signal used to increase the intelligence of the system.

One naturally wonders whether there might be other exciting possibilities for building small but powerful computing systems.

In the context of this chapter, it is useful to think of an external drive as a communication channel that can be used to influence the dynamics of the system. This communication channel may have different physical manifestations, depending on the dynamical system of interest. It can be thermal radiation, sound, voltage, current, a magnetic field, anything that can possibly interact with the system. The key idea is that this external drive signal can work in synergy with the system and with the input of the computation. Thus, to be used this way, the dynamical system has to allow for two input channels: one delivers the information about the input, and the other enhances the computing abilities of the system.

Explaining what the external drive is, is relatively simple. Explaining how it works is much harder. The key purpose of engineering a drive is to achieve a dynamical synergy between the input signal and the dynamics of the system. To understand what is meant by "advantageous synergy", one has to be familiar with the idea of phase space separation, illustrated in Figure 9.3. The phase space of the system is the set of all possible configurations of the system. One can think of the state of the system as a point in that space. Over time, the point that represents the state of the system moves, and the trace of such points is a trajectory. The system has the phase space separation property if, under different inputs, the trajectories of the system cluster in separate regions. This is important from the computation point of view, since to read the output of the computation one simply assesses in which region the point representing the state resides. This is easy to do if one can associate distinct regions of the configuration space with distinct inputs, since the decision boundaries will then be relatively simple geometrical objects. If the separation property holds, the engineering overhead necessary to implement the readout layer is minimal. One can say that there is a strong correlation between the input and the state of the system; a small numerical illustration is given after Figure 9.3.

Figure 9.3. The phase space separation concept. The state of the system is a point in a D-dimensional space. The set of all possible states is the configuration space. A trajectory is the set of points in the configuration space that the system traces when exposed to an external input q. Assume that q1 and q2 exhibit some distinct features that can be recognized by the system. Panel (a): phase space separation has not been achieved, since it is impossible to infer whether the system has been exposed to q1 or q2. Panel (b): phase space separation has been achieved; the trajectories cluster in two separate regions of the configuration space.
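As a toy illustration of the separation property, the sketch below drives a small nonlinear system with inputs from two classes (near-constant versus oscillating) and inspects where the final states land. The system, its parameters, and the two signal classes are invented for illustration only.

```python
import numpy as np

def final_state(q, steps=300):
    """Evolve a toy 2D nonlinear system driven by the input series q."""
    x = np.zeros(2)
    for t in range(steps):
        x = np.tanh(np.array([0.9 * x[0] + 0.5 * x[1] + q[t],
                              -0.5 * x[0] + 0.9 * x[1]]))
    return x

rng = np.random.default_rng(2)
T = 300
stable = [np.full(T, 0.5) + 0.01 * rng.normal(size=T) for _ in range(20)]
varying = [0.5 * np.sin(0.3 * np.arange(T) + rng.uniform(0, 6.28))
           for _ in range(20)]

ends_stable = np.array([final_state(q) for q in stable])
ends_varying = np.array([final_state(q) for q in varying])

# Phase space separation in miniature: if the two clouds of final states
# occupy distinct regions, a simple threshold can serve as the readout.
print("class means:", ends_stable.mean(axis=0), ends_varying.mean(axis=0))
```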

This is a somewhat static picture: information can be stored in the shape of the trajectory too, that is, time can be important as well. The order in which the points have been visited could also be exploited for computation, but that would require more complicated readout layers. In any case, the main reason for engineering a drive signal is to enhance the phase space separation property.

9.5. A Thought Experiment: Maxwell's Daemon Rebooted

Here we describe the key ideas without referring to mathematical formalities, with the particular aim of illustrating the issue of balances: "how much intelligence", "where", and "at which cost". This will be done in the form of a thought experiment, as follows.

We focus on a very simple dynamical system, to be referred to as Maxwell's machine. Consider a gas consisting of hard balls that move freely but change direction after elastically colliding with each other. Assume that there are N such balls in a container.

The input to the computation is defined as follows. To hold all these particles in a box, one needs to apply external pressure, but it is possible that the particles are sensitive to other external influences as well; each particle might, for example, be sensitive to an external magnetic field. Such external influences can be used as the input to the computation. The number of possibilities is virtually endless. For simplicity, we just assume that we change the external pressure at will, and the numerical values of the applied pressure, for example at equally spaced time instances, serve as the input. Thus, the input to the system is a discrete series of pressure values.

To define the output of the computation, assume that we can equip every device with a small information processing daemon, perhaps something like the famous Maxwell's daemon. While Maxwell's creature was only envisioned to measure the positions of atoms and their velocities, with the goal of breaking the laws of thermodynamics, we would like to "recruit" a daemon that prefers to do solid information processing and has more options for conveying what it sees. Optionally, we could even take several such daemons and position them inside the gas container, perhaps aiming to distribute them evenly, covering different regions of the gas. The information obtained from the daemons is used to produce the final output.

How should we instruct such daemons to behave? This is, without a shadow of a doubt, an extremely complicated algorithmic problem. Of course, the key question is what we wish to compute. Let us assume we want to be greedy and be able to compute everything (in the context of time-series data pattern recognition). Let us illustrate some of the conceptual dilemmas.

To begin with, let us assume utterly unintelligent daemons. Every daemon has only a local perception of the world. It reports the velocities of the atoms in its vicinity, so all the atoms it reports on have roughly the same position, but their velocities can differ. The input to the device is a time series representing the values of the pressure applied at every time instance.

The raw output of the computation would be a time series of the velocities all daemons see. How much should one be able to compute with such a system?

Maxwell's machine 1 — Nonlinear dynamics without feedback (MM1): The system exhibits nonlinear dynamics, and according to common wisdom, such a system should be fairly intelligent: an instance of an elastic collision is a very "violent" nonlinear event. In fact, there is a formal mathematical result that encourages us to believe that such a system should be able to compute anything in the limit where N becomes extremely large. But let us settle for something simple: a binary classification task in which the goal is to distinguish stable pressure profiles from varying ones. For the first type of profile the device should output 1, for the second −1. How big should N be so that we can achieve this? For example, can we achieve this with only one ball present in the volume? Probably not. Is it going to be easier with two balls? Maybe. Apparently, this is not an easy problem to solve. The main issue is that the daemons do not see patterns. Yet the claim is that the intelligence is already there in the system, if N is big enough, and one ought to be able to extract it in a simple way. But how? This example illustrates how hard it can be to use formal mathematical results in designing practical solutions.

Maxwell's machine 2 — Nonlinear dynamics with extended memory (MM2): Now assume that we instruct the daemons differently: they are allowed to memorize short sequences of the velocities they have seen in the past. It is reasonable to expect, though hard to verify without extensive study, that this should help in solving the problem; namely, some information about the input could be encoded as correlations in the velocity sequence. Ultimately, assume that the daemons do not have local vision but extended vision in space. Again, it is reasonable to expect that this would be the best option: the number of balls in the system could then likely be reduced. This example shows that the way we interact with the system is very important for achieving a certain computation; the toy sketch below contrasts memoryless and memory-equipped daemons.
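The sketch below is a crude caricature of MM1 versus MM2, not a physical gas simulation: the gas is replaced by a noisy driven variable, a memoryless daemon reports only the latest speed it sees, and a memory-equipped daemon reports a feature of its recent history. Everything here is invented purely to make the dilemma concrete.

```python
import numpy as np

rng = np.random.default_rng(3)

def speeds(pressure):
    """Caricature of the gas: speeds jitter around the applied pressure."""
    return pressure + 0.2 * rng.normal(size=len(pressure))

def daemon_mm1(v):
    # Memoryless, local daemon: only the latest speed it happens to see.
    return v[-1]

def daemon_mm2(v, window=20):
    # Daemon with memory: a feature of the recent history (here, variance).
    return np.var(v[-window:])

T = 200
stable = np.full(T, 1.0)
varying = 1.0 + 0.5 * np.sin(0.3 * np.arange(T))

for name, profile in [("stable", stable), ("varying", varying)]:
    v = speeds(profile)
    print(name, "MM1 feature:", round(daemon_mm1(v), 3),
          "MM2 feature:", round(daemon_mm2(v), 3))
# MM1's single-sample feature barely separates the two classes, while
# MM2's history-based feature does: a cartoon of why memory helps.
```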

Maxwell’s machine 3 — Nonlinear dynamics with feedback (MM3): The interaction can happen at any level. One could engineer it like that. For example, there are intrinsic interactions between the balls. But one could arrange for extrinsic interactions, both at the input level, and the output level, and even both at the time. This last option leads to the most intelligent systems. For example, the most powerful form of interaction with the system would be to allow the daemons to memorize what they have experienced, and to further influence velocities of the balls they are observing. The thus constructed exhibits with delayed feedback behavior. One could harvest very complex information about spatio-temporal correlations in the particle dynamics for computation. Maxwell’s machine 4 — External drive (MM4): In our work we have explored another option. Assume that the system can be engineered so that the motion of the particles can be influenced by an auxiliary signal, for example, external magnetic field if all or some of the particles in the system are magnetic. This external drive signal could be used to enhance the intelligence of the system by achieving advantageous correlations between the computing goal and the systems dynamics. For example, a goal could be to achieve a situation in which all particles cluster in one corner of the volume when exposed to a signal from class 1, and in the other corner of the volume when exposed to a signal from class 2. For example, it is possible that a very small MM4 could achieve the same task as a rather big MM1, since part of the computing resources is moved into the external drive. Naturally, one needs to prepare that drive, but once that work has been carried out, the machine can operate autonomously. One might feel that MM4 should be easier to engineer than MM1-MM3. Interestingly, the information processing options represented by MM1-MM3 have been studied extensively in the literature and the last option, MM4, still needs to be investigated. In fact, in machine learning a similar technique exists and is referred to as “feature engineering”, where one provides extra input channels to the neural network. Yet, our focus is on generic dynamical systems, and we believe that the external drive concept is much deeper,

In the following we discuss an example of how powerful the external drive idea can be.

9.6. Example: Memristor Networks

Here we discuss a simple example that illustrates the ideas presented above. As the dynamical system we use memristor networks. It is shown that instead of aiming for bigger systems, one can achieve a considerable amount of computation using only a simple system.

9.6.1. Memristor model

A memristor is a nonlinear, passive, two-terminal component with a time-varying resistance, often referred to as the memristance R(t). The memristor element is suitable for temporal information processing since it exhibits the filter property: the memristance value at a specific time instance depends on the whole history of the applied voltage up to that time. A simple Pershin–Di Ventra model18 is sufficient for the purposes of this chapter. The memristance R(t) changes depending on the voltage ΔV applied across the element according to a predefined law: Ṙ = f(R, ΔV). Here and in the following, the dot over a symbol denotes a time derivative, Ṙ = dR/dt. If |ΔV| < Vthr, the memristance changes as Ṙ = αΔV; for |ΔV| > Vthr, Ṙ = βΔV + k. The parameters α and β are device dependent, and for a typical memristor α ≪ β. Due to the nature of the ionic transport in the material, the memristance is bounded between a lowest value Rmin and a maximum value Rmax; the laws just stated apply only while the memristance is within these boundaries. The parameter k is a constant offset chosen to make the response curve f(R, ΔV) continuous in the variable ΔV.

A memristor network is essentially a resistor network in which the usual resistances are replaced by memristances. Thus all of circuit analysis theory can be applied directly to simulate such models. This is useful since at each simulation step, assuming as usual that we are working with small time increments, one can first find the voltages and currents.

At a given time instance t one can compute the voltage differences ΔV across the elements; in a second step, these voltage differences are used to update the memristance values. The process is then repeated for every time instance.

The natural input to the device can be an applied external voltage or current. This would be a brute-force design, but a very effective one nevertheless; a few applications of memristor networks for temporal information processing along these lines can be found in Refs. [9, 19]. Another option, explored by us, is to assume that the parameters of the model can change in time. The input to the device could then consist of time series data describing how a parameter of the model varies in time. In physics, such a parameter is often called an external field. For example, a very natural target for this type of application is the parameter β, with β(t) = a + bq(t), where q(t) is the actual input signal and a and b are conversion constants.

As the external drive signal, we have followed a natural intuition and chosen a simple voltage source. This voltage source is assumed to be immersed somewhere in the network; for a one-memristor network, it is the voltage u(t) applied at the memristor ports. To illustrate the phase separation ideas, we have focused on a simple binary classification problem where the goal was to distinguish two classes of signals q(t), stable and varying; these are illustrated in Figure 9.4. The goal was to find a drive signal u such that the memristance values cluster in separate regions of the state space, which in this case is the interval [Rmin, Rmax]. Figure 9.5 shows how the memristance behaves when exposed to a pair of signals from the two different classes. The phase separation is manifest in the fact that the two graphs separate as time goes on.

This example shows that part of the system's intelligence can be embedded into the external drive. The decision process is indeed simple: one applies the external drive signal and the input signal, and waits a little. After some time, once the system has been allowed to do the computation (on its own), it is sufficient to check whether the memristance value is closer to Rmin or Rmax. We have shown that this approach can work remarkably well, even for signals that are not perfectly synchronized.
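To make the model concrete, here is a minimal Python integration of the threshold memristor law described above, with the input entering through β(t) = a + b·q(t) and a sinusoidal drive u(t) standing in for the optimized drive of Refs. [9, 16]. All numerical values (thresholds, rates, the drive shape) are placeholders chosen for illustration, not the trained parameters from those works.

```python
import numpy as np

# Illustrative device parameters (placeholders, not fitted values).
R_MIN, R_MAX = 100.0, 1000.0
V_THR = 0.5
ALPHA, BETA0 = 1.0, 50.0     # slow below threshold, fast above (alpha << beta)
A, B = 1.0, 1.0              # beta(t) = A + B * q(t): input as "external field"
DT = 0.01

def simulate(q, u):
    """Integrate R' = f(R, dV), with q(t) modulating beta and drive u(t)."""
    R = 0.5 * (R_MIN + R_MAX)
    for q_t, u_t in zip(q, u):
        dV = u_t                               # one-memristor network: dV = u(t)
        beta = BETA0 * (A + B * q_t)
        if abs(dV) < V_THR:
            dR = ALPHA * dV
        else:
            # k makes the response continuous at |dV| = V_THR.
            k = (ALPHA - beta) * V_THR * np.sign(dV)
            dR = beta * dV + k
        R = np.clip(R + dR * DT, R_MIN, R_MAX)  # memristance stays bounded
    return R

t = np.arange(0, 20, DT)
u = 0.8 * np.sin(2.0 * t)                      # stand-in for an optimized drive
q_stable = np.full_like(t, 0.2)
q_varying = 0.2 + 0.2 * np.sin(0.7 * t)

# Readout: which boundary is the final memristance closer to?
for name, q in [("stable", q_stable), ("varying", q_varying)]:
    R_final = simulate(q, u)
    label = 0 if abs(R_final - R_MIN) < abs(R_final - R_MAX) else 1
    print(name, "-> R =", round(R_final, 1), "class", label)
```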

Figure 9.4. Illustration of the training data for the two different classes. (a) The training data for class 0 (e.g., represented by the signal q1); (b) the training data for class 1 (e.g., represented by the signal q2).

Figure 9.5. Figure taken from Ref. [9]. The simulated outputs of the one-memristor network under the optimum drive signal (obtained from the genetic algorithm) and two input signals, one from class 0 and one from class 1.

We have demonstrated the idea on two realistic problems: ECG classification16 and sepsis prediction for patients in the intensive care unit (ICU).17

ECG signal classification: Using a single memristor, it is possible to classify an ECG signal and discriminate whether a person has a heart problem or not. The models have been trained and tested on a labeled data set of electrocardiogram (ECG) signals.

A binary classification problem has been investigated. The strategy of using a single memristor for ECG signal classification is simple to describe: the memristance ought to be driven towards Rmin or Rmax depending on which of the two classes the input signal belongs to. If this can be achieved, classification can be performed by simply checking whether the average memristance value exceeds a predefined threshold; a sketch of this readout is given below. It has been shown that electrocardiogram signals can be classified as either healthy or diseased by a single memristor element with an optimized drive signal, and this with a relatively high accuracy. The system has been trained with few training examples (80 signals), has been tested with 1,480 signals, and still performed very well. One reason for this good performance could be the small number of trainable parameters: 10 parameters were used to train the drive signal, and two parameters to train a feedback whenever one was used. It has also been shown that training feedback mechanisms or input layers significantly improves the performance of the device. Those options would be useful for software implementations, since only a few parameters need to be trained, but they would cost energy in hardware implementations.

The synergy between the drive signal and the input signal is important, and it has been fully exploited in the case of the ECG problem. In fact, the originally downloaded ECG signals were all synchronized according to the QRS peak; the top of the QRS peak is a natural time reference for all ECG signals. However, we also tested the approach with non-synchronized signals. This is a harder problem, since the phase of the ECG is unknown to the trained models. For this purpose, the signals have been randomly shifted in time so that they are all non-synchronized with each other. Thus, the models were tested on two major cases: when the ECG signals are aligned and when the signals are asynchronous. The device performed relatively well even when random phase shifts were employed. Of course, the device performance deteriorated when synchronization was completely lacking.
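The decision rule itself is a one-liner; the sketch below spells it out, together with a toy accuracy computation, reusing simulate(), u, and the invented signals from the previous sketch. The threshold (the midpoint of [Rmin, Rmax]) and the class-to-side mapping are illustrative assumptions, not the trained values of Ref. [16].

```python
# Threshold readout reusing simulate(), u, and the toy signals defined above.
THRESHOLD = 0.5 * (R_MIN + R_MAX)

def classify(q, u):
    # Which class maps to which side of the threshold is an arbitrary
    # illustrative choice here; in practice it is fixed during training.
    return 1 if simulate(q, u) > THRESHOLD else 0

test_set = [(q_stable, 0), (q_varying, 1)]      # invented labeled signals
correct = sum(classify(q, u) == label for q, label in test_set)
print("toy accuracy:", correct / len(test_set))
```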

Sepsis prediction: We have also tried to test these ideas against a much harder problem. In a first step, a single memristor was used to predict, at an early stage, whether an ICU patient has sepsis or not. For this purpose, data sets of clinical variables from ICU patients have been downloaded from the 2019 PhysioNet Computing in Cardiology Challenge (https://physionet.org/content/challenge2019/1.0.0/).20 The data consist of 41 clinical variable values from 45,643 ICU patients, traced over time on a per-hour basis. For a one-memristor system, the results were not spectacular, which simply showed that a single-memristor device has its limitations. However, when the system was extended to include more memristors, the results improved significantly. We have considered a simple branching structure in which several memristors operate in parallel, each receiving a single input signal that consists of a weighted sum of the 41 clinical parameters.

The results show that a single memristor with 79 trainable parameters as an input layer can achieve utility scores of ≈0.3140, which corresponds to approximately 80% true negatives and 58% true positives. This method is promising because it uses a small number of parameters, but it cannot be compared with deep learning methods where thousands of parameters are trained; there is certainly underfitting to the training data. When more heterogeneous memristors are used, connected in parallel as explained above, the results improved significantly. Only a single trainable readout layer has been used, which weighted the memristance values. The remarkable result was that a system with 500 memristors in parallel can achieve utility scores on the training data set comparable to other deep learning methods (≈0.43 utility score); this utility score was the maximum achieved on a separate test set in the 2019 PhysioNet Computing in Cardiology Challenge. This is a remarkable result because only 501 parameters have been trained (the main competitor, the deep neural network, has a much larger number of free parameters). A boosting algorithm has also been suggested, in which one system can learn to correct the mistakes made by other systems. By using 20 systems consisting of 500 memristors each, an even better fit to the training data has been obtained, with a utility score of 0.59.

The next step would then be to regulate the overfitting and achieve a reasonable bias–variance trade-off. Since few parameters are trained in reservoir computing, it is expected that overfitting could be overcome relatively easily, for example by using a larger training data set. These methods are generic and can be applied to other prediction problems, or even extended to regression problems.

These two examples illustrate how the approach of balancing the intelligence of the system, and even carefully engineering it, can be an extremely powerful one in terms of practical applications, demonstrating, again, what happens when philosophy meets technology. Changing the way one thinks about the problem, and simply asking different types of questions, can lead to remarkable performance: the computing capacity of a deep neural network (the state-of-the-art machine learning technique) achieved with far fewer parameters. A sketch of the parallel branching architecture follows.
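The following sketch mimics the branching structure described above: each of several memristors receives its own weighted sum of the clinical variables, and a single linear readout weights the resulting memristances. It reuses simulate(), t, and u from the earlier sketch; the number of memristors, the random input-mixing weights, and the synthetic patient data are all invented stand-ins for the trained quantities reported in Ref. [17].

```python
rng = np.random.default_rng(4)

N_MEM, N_VARS, T_STEPS = 10, 41, len(t)       # 10 memristors for the toy version
W_branch = rng.normal(size=(N_MEM, N_VARS))   # untrained input-mixing weights

def bank_features(patient):
    """patient: array (T_STEPS, N_VARS). Returns one memristance per branch."""
    feats = []
    for w in W_branch:
        q = patient @ w                        # weighted sum of clinical variables
        feats.append(simulate(np.tanh(q), u))  # squash, then drive the memristor
    return np.array(feats)

# Synthetic "patients": random hourly traces of 41 variables, random labels.
patients = rng.normal(size=(30, T_STEPS, N_VARS))
labels = rng.integers(0, 2, size=30)           # placeholder sepsis labels

X = np.array([bank_features(p) for p in patients])
# Single trainable readout layer weighting the memristance values.
w_out, *_ = np.linalg.lstsq(X, labels.astype(float), rcond=None)
print("readout weights over", N_MEM, "parallel memristors:", w_out.round(3))
```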

9.7. Conclusions

The research results outlined in this chapter argue for a somewhat surprising effect: a relatively simple system can compute quite a lot if equipped with some sort of "external brain" that is simple to engineer, namely the external drive signal. What is remarkable about this approach is that the drive does not have to be super intelligent. It works more like a prompter in the theater who helps the actor remember lines: the drive does not "think", the drive rather "whispers". The beauty of the approach is that this external drive signal needs to be engineered only once, during the optimization phase of the device, and can then simply be used forever, for every possible input, even inputs not yet seen by the device.

But there is an additional, more profound message in the text. Every nonlinear dynamical system has the potential to compute. In fact, there is a formal mathematical proof that, if the system is complex enough, a single dynamical system can perform universal computation. Yet the mathematical proof is an idealization; in practice, it is hard to judge what "complex enough" really means. The research results outlined in this chapter argue for a somewhat surprising insight: the question of computing capacity cannot be understood in isolation.

The intelligence is not defined only by the system per se, but also by the way we interact with it, and by the resources that are necessary to execute that interaction. The concept of "accidental computation" is not as absolute as one might naively think. The computing ability does not exist on its own: it exists only if we can harvest it, and provided we can find ways to interact with the system that are purposeful for information processing goals. Finding useful ways of interacting with dynamical systems is the key issue. The existence or absence of computation, or the amount of computation a system can perform, is relative to the way one can exploit the intrinsic intelligence of the system, and thus ultimately to the way one can interact with the system. This further implies that we should probably revise the way we think about computing capacity. It is not about how large the computing capacity of the device is; rather, the central question is what the cost of extracting it is. And this is where philosophy (the computing stone) meets engineering (design principles). The ultimate big question is how to measure the different balances between all the costs of realizing computation and the benefits one obtains in terms of raw computing power: a rather complex optimization problem for which we do not even know the fitness function.

References

1. R. Landauer, Irreversibility and heat generation in the computing process. IBM J. Res. Develop. 5(3), 183–191 (1961).
2. C. H. Bennett, Universal computation and physical dynamics. Physica D: Nonlinear Phenomena 86(1), 268–273 (1995).
3. Z. Konkoli, On reservoir computing: From mathematical foundations to unconventional applications. In A. Adamatzky (ed.), Advances in Unconventional Computing: Volume 1: Theory (Springer International Publishing, Cham, 2017), pp. 573–607.
4. B. J. MacLennan, Physical and formal aspects of computation: Exploiting physics for computation and exploiting computation for physical purposes. In A. Adamatzky (ed.), Advances in Unconventional Computing: Volume 1: Theory (Springer International Publishing, Cham, 2017), pp. 117–140.
5. Z. Konkoli, A perspective on Putnam's realizability theorem in the context of unconventional computation. Int. J. Unconvent. Comput. 11, 83–102 (2015).
6. H. Putnam, Representation and Reality (MIT Press, Cambridge, 1988).

7. International Technology Roadmap for Semiconductors 2.0. Report (2015). URL http://www.itrs2.net/.
8. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468/1–6 (2011).
9. V. Athanasiou and Z. Konkoli, On using reservoir computing for sensing applications: Exploring environment-sensitive memristor networks. Int. J. Parallel, Emergent Distrib. Syst. (2017).
10. Z. Konkoli, The sweet algorithm: Generic theory of using reservoir computing for sensing applications. Int. J. Parallel, Emergent Distrib. Syst. (2016).
11. C. Bennett, A. Jesorka, G. Wendin, and Z. Konkoli, On the inverse pattern recognition problem in the context of the time-series data processing with memristor networks. In A. Adamatzky (ed.), Advances in Unconventional Computation, Vol. 2: Prototypes and Algorithms (Springer, 2016).
12. E. Bergh and Z. Konkoli, On improving the expressive power of chemical computation. In A. Adamatzky (ed.), Advances in Unconventional Computation, Vol. 2: Prototypes and Algorithms (Springer, 2016).
13. Z. Konkoli, On reservoir computing: From mathematical foundations to unconventional applications. In A. Adamatzky (ed.), Advances in Unconventional Computing, Vol. 1: Theory (Springer, 2016).
14. V. Athanasiou and Z. Konkoli, On mathematics of universal computation with generic dynamical systems (2020), pp. 385–405.
15. V. Athanasiou, K. K. Tadi, M. Hurevich, S. Yitzchaik, A. Jesorka, and Z. Konkoli, On sensing principles using temporally extended bar codes. IEEE Sensors J. 20(13), 6782–6791 (2020).
16. V. Athanasiou and Z. Konkoli, On improving the computing capacity of dynamical systems. Sci. Rep. 10(1), 9191 (2020).
17. V. Athanasiou and Z. Konkoli, Memristor models for early detection of sepsis in ICU patients. In 2019 Computing in Cardiology (CinC), pp. 1–4.
18. M. Di Ventra, Y. V. Pershin, and L. O. Chua, Circuit elements with memory: Memristors, memcapacitors, and meminductors. Proc. IEEE 97(10), 1717–1724 (2009).
19. C. Du, F. Cai, M. A. Zidan, W. Ma, S. H. Lee, and W. D. Lu, Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8(1), 2204 (2017).
20. M. Reyna, C. Josef, R. Jeter, S. Shashikumar, M. Westover, S. Nemati, G. Clifford, and A. Sharma, Early prediction of sepsis from clinical data: The PhysioNet/Computing in Cardiology Challenge 2019. Crit. Care Med. 48, 210–217 (2020).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0010

Chapter 10

Physical Randomness Can Help in Computations

Olga Kosheleva∗ and Vladik Kreinovich†
University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
∗ [email protected][email protected]

Can we use some so-far-unused physical phenomena to compute something that usual computers cannot? Researchers have proposed many schemes that may lead to such computations. These schemes use different physical phenomena, ranging from quantum-related to gravity-related to the use of hypothetical time machines. In this chapter, we show that, in principle, there is no need to look into state-of-the-art physics to develop such a scheme: computability beyond the usual computations naturally appears if we consider such a basic notion as randomness.

10.1. Introduction

While traditional computers have achieved great results, there are still many important problems for which computations on current computers are too slow — not to mention that there are many problems which provably cannot be algorithmically solved on modern computers; see, e.g., Refs. [1–3]. To overcome this, many researchers and engineers are working on making computers faster, and to achieve that goal, they are figuring out whether we can utilize additional physical processes. Several such schemes have been proposed, using physical phenomena ranging from quantum effects to black holes to even hypothetical phenomena such as causal anomalies (time machines); see, e.g., Refs. [1, 2, 4–8].

In this chapter, we show that there is yet another potential route to new computational abilities: namely, the use of physical randomness.

10.2. Randomness is Important

In classical (pre-quantum) physics, randomness was mainly subjective. All the way to the beginning of the 20th century, all known laws of physics were deterministic. A good example is Newton's laws: once we know the initial positions and velocities of all the planets, we can predict, many years ahead, where the planets will be at any future moment. This is not just a theoretical possibility: astronomers did predict the planets' positions with very high accuracy. Indeed, one of the motivations for developing General Relativity was that the observed position of Mercury differed from the predicted one, with the difference growing by 43 arcseconds per 100 years.

In that period, randomness was not in the physical theory itself; randomness was used, starting with Gauss, to describe the fact that we do not know the exact positions, the exact velocities, and, in general, the exact values of other physical characteristics. In other words, in those days, randomness was mainly subjective, reflecting our lack of knowledge.

The appearance of objective randomness. Around the turn of the 20th century, it became clear that, in addition to subjective randomness, which describes our ignorance, there is also objective randomness: randomness which is essential to the corresponding physical process itself. The first such phenomenon was radioactivity, when atoms emit high-energy particles. A radioactive material such as radium consists of numerous absolutely identical atoms. If the corresponding physics were deterministic, they would all emit the corresponding particles at exactly the same time, but this is not what we observe. What we observe is that at any given moment of time:

• some atoms remain stable, while
• other atoms emit particles.

There is no regularity in which atoms are stable and which are not; in this sense, the corresponding process is truly random.

According to modern physics, randomness is ubiquitous. Studies of radioactivity and related phenomena led to the development of a new physics, quantum physics, according to which all real-life processes are probabilistic; see, e.g., Refs. [9, 10]. This means, in particular, that, in general:

• it is not possible to predict future events;
• it is only possible to predict the probabilities of different future events.

When we repeat the same experiment with random results again and again, we get a random sequence.

Randomness is also important for reconciling the reversibility of physical laws with the irreversibility of many physical processes. Later, it turned out that randomness is important already in the non-quantum approximation to the true physics: namely, it is needed to reconcile physical equations with observations. Why do we need to reconcile them? Because all fundamental physical equations, starting with Newton's laws, are reversible: if we keep all the particles where they are and reverse all the velocities, the system will reverse its trajectory and eventually reach the original state. This is easy to show at the mathematical level, and easy to illustrate with the example of a simple mechanical system. However, in real life, many processes are irreversible: if we break a cup, there is no way to make the pieces come together into the original unbroken shape. This seeming contradiction was known already to Boltzmann, the father of statistical physics. The modern explanation of this seeming paradox is that, in addition to equations, we also need to take into account initial conditions. There may be some restrictions on these initial conditions, but within these restrictions, we do not expect any additional regularities. In other words, the initial conditions should be random.

This additional assumption breaks the original symmetry between past and future: when the initial conditions are random, the resulting future state is not random at all. For example, if we start with a random distribution of matter, we end up with the current hierarchical structure of galaxies, stars, etc.

10.3. How Can We Describe Randomness in Precise Terms?

Need to go beyond the traditional probability theory. At first glance, it may appear that the question raised in the title of this section was answered a long time ago: starting with Gauss, we have used probabilities to describe randomness. It is true that probability theory has been very successful in describing physical phenomena. However, as noticed already by Kolmogorov, the father of modern probability theory, this theory deals with ensembles, with mass phenomena. This is great, but physicists also deal with individual phenomena. For example:

• If we measure the value of some quantity at consecutive moments of time and get 010101..., we clearly have a regular process, namely a periodic process.
• On the other hand, if we flip a coin several times, or perform some quantum experiment with random outcomes, we get a sequence with no regularities, a sequence that, from the physical (and commonsense) viewpoint, is random.

In formalizing the difference between the two cases, traditional probability theory is of no help: according to this theory, there is no difference between the sequence 0101... and any other sequence, since all sequences of length n have the same probability 2^{-n}, where n is the number of bits. To describe this difference, Kolmogorov and other researchers came up with the special notions of Kolmogorov complexity and algorithmic randomness; see, e.g., Ref. [11]. Let us describe these notions.

Kolmogorov complexity and algorithmic randomness. What is the difference between a regular sequence like 0101... and a truly random sequence?

• A regular sequence like 01010..., no matter how long, can be generated by a very short and simple program: it is sufficient to run a loop that repeatedly prints 01.
• On the other hand, the very fact that a binary sequence is random means that it has no regularities, so no such short program is possible, and the only way to generate a random sequence 0110... of length n is to write something like print(0110..). This program requires that we reproduce the whole sequence, so its length (= number of bits) is close to n.

In other words:

• a regular sequence x of length len(x) = n can be generated by a short program, a program whose length is much shorter than n;
• on the other hand, the only way to generate a random binary sequence x of length n is to have a program whose length is close to n.

To formalize this idea, researchers came up with the following notion of the Kolmogorov complexity K(x) of a given string x. We fix some universal programming language, and we define K(x) as the length of the shortest program p that generates x:

K(x) := min{len(p) : p generates x}.

According to the above arguments:

• for a regular sequence, K(x) ≪ len(x);
• on the other hand, for a truly random sequence x, we have K(x) ≈ len(x).

This leads to the following formal definition (see Refs. [11, 12]).

Definition 1. Let C be an integer. We say that a binary sequence x is C-random if K(x) ≥ len(x) − C.
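Kolmogorov complexity itself is not computable (a fact used in Section 10.5 below), but a real compressor gives a computable upper bound on description length and thus a rough, practical feel for the definition. The sketch below uses Python's zlib for this purpose; equating "compressed size" with K(x) is only a loose analogy, not the formal notion.

```python
import random
import zlib

def description_length_bound(bits: str) -> int:
    """A computable upper bound on the description length of a bit string."""
    return len(zlib.compress(bits.encode()))

random.seed(0)
regular = "01" * 500                                             # periodic
random_like = "".join(random.choice("01") for _ in range(1000))  # coin flips

print("regular:    ", description_length_bound(regular), "bytes")
print("random-like:", description_length_bound(random_like), "bytes")
# The periodic string compresses to a tiny fraction of its length, while the
# coin-flip string compresses far less: the qualitative gap behind Definition 1.
```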

10.4. Back to Physics

Back to physics. So, whenever we want to describe that some physical sequence is random, we formalize this by saying that the sequence is C-random for some appropriate value C (and, of course, for an appropriate selection of a programming language).

One more thing about physics. Physicists usually believe that:

• if some phenomenon never happens in the world,
• then there must be a reason why it never happens.

This sounds natural. Interestingly, we can reverse this statement and formulate a logically equivalent statement which may sound less natural:

• if some phenomenon does not contradict any physical laws,
• then eventually, we will observe this phenomenon.

From this viewpoint, if the only restriction on a binary sequence (obtained from observing a physically random phenomenon) is that this sequence should be random (in the sense of Definition 1), this means that every sequence which is random according to this definition will eventually occur as a result of some observation.

10.5. How Does This Affect Computations

If a sequence is not random, we will eventually find this out. Let us first show that if a binary sequence x is not C-random, then we will eventually find this out. Indeed, if the sequence x is not C-random, this means that its Kolmogorov complexity K(x) is smaller than len(x) − C. By definition of the Kolmogorov complexity, this means that there exists a program p of length len(p) < len(x) − C that generates the sequence x. There are finitely many sequences of any given length; thus, there are finitely many programs p of length len(p) < len(x) − C. So, all we have to do is to start all these programs.

If a sequence x is not truly random, one of these programs will eventually generate x, and thus we will know that this sequence x is not C-random.

There is a physical way to check whether a given binary sequence is C-random. Let us show that this enables us to decide, for each binary sequence x, whether this sequence is C-random or not. To do that, we simultaneously start two processes:

• in the first process (as described in the previous subsection), we run all possible programs p of length len(p) < len(x) − C and wait until one of these programs generates the given sequence x;
• in the second process, we perform all possible measurements of random phenomena and wait until, in one of these measurements, we get exactly the given sequence x.

As a result:

• If the sequence is not C-random, then one of the programs from the first process will generate x, and thus we will know that the given sequence x is not C-random.
• On the other hand, if the sequence x is C-random, then, according to the arguments from the previous section, it will eventually appear in some observation, so we will know that this sequence x is C-random.

So, by using the physical world, we will always be able to tell whether a given sequence is C-random or not. This may sound natural and simple, but without the use of the physical world, we cannot check C-randomness. Interestingly, the above simple scheme enables us to solve a problem which is not algorithmically solvable on a normal computer! Indeed, here is a simple proof.

Proposition 1. No algorithm is possible that, given a binary sequence x, checks whether this sequence is C-random.

Proof. This proof uses a known auxiliary result: for each C > 0 and for each n, there exists a sequence x of length n for which K(x) ≥ len(x) − C.

Indeed, by definition of Kolmogorov complexity, each binary sequence which is not C-random, i.e., for which K(x) < len(x) − C = n − C, can be generated by a program of length < n − C, i.e., equivalently, of length ≤ n − C − 1. There are:

• 2^1 = 2 binary strings of length 1,
• 2^2 = 4 strings of length 2, etc., and
• 2^{n−C−1} strings of length n − C − 1.

Overall, there are

2^1 + 2^2 + 2^3 + · · · + 2^{n−C−1} = 2^{n−C} − 2 < 2^{n−C}

possible binary strings of length ≤ n − C − 1. Thus, there are fewer than 2^{n−C} programs of length ≤ n − C − 1.

• Each program generates only one output, so all these programs can generate at most 2^{n−C} different strings.
• On the other hand, there are 2^n > 2^{n−C} strings of length n.

Thus, some strings of length n are C-random. On the set of all binary sequences of length n there is a natural order, corresponding to the numerical values of these sequences interpreted as binary numbers, so that:

• 00...0 is 0,
• 00...01 is 1, etc., all the way to
• 11...1, which is 2^n − 1.

Using this order, among the several possible C-random sequences of length n, we can select the one sequence r_n corresponding to the binary sequence with the smallest possible number.

Now, we are ready to prove the proposition by contradiction. Let us assume that there exists a program P (of some length len(P)) that:

• given a binary sequence x,
• checks whether K(x) ≥ len(x) − C.


Then, we can compute rn as follows:
• We enumerate all the numbers 0, 1, 2, . . ., and for each number, we use the program P to check whether the corresponding sequence of length n is C-random.
• Once we reach the first C-random sequence, we stop and output it: this is the desired sequence rn.
The resulting program R for computing rn consists of:
• the program P,
• a few lines implementing the enumeration (adding 1), and
• a binary encoding of the length n, which takes only about $\log_2 n$ bits.
So, the length of the program R grows only logarithmically with n. Hence, for large enough n, this length is < n − C. But this contradicts the definition of rn as one of the C-random sequences, i.e., by definition, sequences which cannot be generated by any program whose length is smaller than n − C. This contradiction shows that our assumption was wrong, and thus, indeed, no algorithm is possible that:
• given a binary sequence x,
• checks whether this sequence is C-random.
The proposition is proven.



A word of caution. In principle, the above scheme allows us, by using the physical world, to solve a general problem that cannot be solved algorithmically on existing computers. But is it practical? Not really, or, to put it more optimistically, not yet. Indeed, the above procedure requires that, for each sequence x of length n, we perform experiments until we encounter this sequence (or until we find a short program generating this sequence). There are $2^n$ possible sequences of length n. Most of them, as one can easily prove, are C-random. Thus, to get a given C-random sequence, we need between 1 and $\approx 2^n$ measurements. So, to get the given sequence, on average, we will need $\approx 0.5 \cdot 2^n$ measurements. Thus, this procedure requires exponential time and is, therefore, not feasible; see Refs. [2, 3].


A word of hope. The above procedure is not practical yet. However, the fact that such a procedure exists in the first place gives us hope that a feasible (or at least more feasible) version of this procedure will eventually be found.

Acknowledgments

This work was supported in part by the US National Science Foundation grants 1623190 (A Model of Change for Preparing a New Generation for Professional Practice in Computer Science) and HRD-1242122 (Cyber-ShARE Center of Excellence).

References
1. O. Kosheleva and V. Kreinovich, Space-time assumptions behind NP-hardness of propositional satisfiability. Mathematical Structures and Modelling 29, 13–30 (2014).
2. V. Kreinovich, A. Lakeyev, J. Rohn, and P. Kahl, Computational Complexity and Feasibility of Data Processing and Interval Computations (Kluwer, Dordrecht, 1998).
3. C. Papadimitriou, Computational Complexity (Addison-Wesley, Reading, Massachusetts, 1994).
4. S. J. Aaronson, NP-complete problems and physical reality. ACM SIGACT News 36(1), 30–52 (2005).
5. M. Koshelev and V. Kreinovich, Towards computers of generation Omega — non-equilibrium thermodynamics, granularity, and acausal processes: A brief survey. Proceedings of the International Conference on Intelligent Systems and Semiotics ISAS'97 (National Institute of Standards and Technology Publ., Gaithersburg, MD, 1997), pp. 383–388.
6. O. Kosheleva and V. Kreinovich, What can physics give to constructive mathematics. In Mathematical Logic and Mathematical Linguistics (Kalinin University, Kalinin, 1981), pp. 117–128 (in Russian).
7. V. Kreinovich, Designing, understanding, and analyzing unconventional computation: The important role of logic and constructive mathematics. Appl. Math. Sci. 6(13), 629–644 (2012).
8. D. Morgenstein and V. Kreinovich, Which algorithms are feasible and which are not depends on the geometry of space-time. Geombinatorics 4(3), 80–97 (1995).
9. R. Feynman, R. Leighton, and M. Sands, The Feynman Lectures on Physics (Addison-Wesley, Boston, Massachusetts, 2005).


10. K. S. Thorne and R. D. Blandford, Modern Classical Physics: Optics, Fluids, Plasmas, Elasticity, Relativity, and Statistical Physics (Princeton University Press, Princeton, New Jersey, 2017).
11. M. Li and P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications (Springer, New York, 2008).
12. L. A. Levin, Randomness conservation inequalities: Information and independence in mathematical theories. Inform. Control 61, 15–37 (1984).



© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0011

Chapter 11

Probabilistic Logic Gate in Asynchronous Game of Life with Critical Property

Yukio-Pegio Gunji∗, Yoshihiko Ohzawa and Terutaka Tanaka

Department of Intermedia, Art and Science, School of Fundamental Science and Technology, Waseda University, 3-4-1, Ohkubo, Shinjuku, Tokyo 169-8555, Japan
∗ [email protected]

Metaheuristics and self-organized criticality (SOC) could contribute to robust computation under perturbed environments. Implementing a logic gate in a computing system in a critical state is one of the intriguing ways to study the role of metaheuristics and SOC. Here, we study the behavior of a cellular automaton, the Game of Life (GL), under asynchronous updating and implement probabilistic logic gates by using asynchronous GL. We find that asynchronous GL shows a phase transition, that the density of the state of 1 decays with a power law at the critical point, and that systems at the critical point have the most computability in asynchronous GL. We implement AND and OR gates in asynchronous GL with criticality, which show good performance. Since tuning the perturbation plays an essential role in operating the logic gates, our study reveals the interference between manipulation and perturbation in probabilistic logic gates.

11.1. Introduction

As we confront large amounts of data, it is often difficult to obtain a complete dataset, resulting in incomplete and ill-defined problems. Therefore, metaheuristics are investigated more intensely than optimization.1 The first direction of metaheuristics is bioinspired computing based on swarm intelligence, such as the krill herd algorithm,2


monarch butterfly optimization,3 ant colony algorithm,4 and others.5 These algorithms can search for quasi-optimal solutions by balancing the global goal of a group with internal perturbations. The second direction is the development of biological computing, in which computing is implemented by living biological and/or chemical material, such as BZ pattern computing,6 physarum computing,7–15 fungal computing16, 17 and soldier crab computing.18, 19 Since a living system maintains wholeness as a living unity and confronts perturbed surroundings, it is obliged to balance its global goal as a system with the surrounding perturbations.1, 20, 21 Thus, living computing is expected to solve ill-defined problems and to obtain quasi-optimal solutions. Since dissipative structures such as the BZ pattern are surrounded by a specific boundary condition, they also possess wholeness to some extent. In that sense, they are similar to quasi-biological computing. The third direction is the reevaluation of self-organized criticality (SOC), in which a critical state characterized by a power-law distribution is automatically obtained by balancing the global optimum with perturbation.22–25 Although SOC has the disadvantage of requiring a fitness function (or potential function) that is difficult to find, the idea of criticality could contribute to the attributes of a quasi-optimal solution under perturbed conditions. Since it has been found that biological systems usually reveal the features of critical phenomena, they are expected to be systems showing SOC (e.g., the Lévy walk; Refs. [26–31]). This leads to the idea that the results of bioinspired computing are destined to be quasi-optimal solutions if they show critical phenomena characterized by the power law. With respect to the power-law distribution, the first, second and third directions of metaheuristics can be combined. This leads to the powerful idea that, instead of implementing SOC by using a specific fitness function, bioinspired computing can play a more important role in generating criticality, which can reveal quasi-optimal solutions. Quasi-optimal solutions for an ill-defined problem could be generalized by universal and efficient computational ability under perturbed natural conditions.32 For this reason, universal and efficient computing devices can be made of biological materials, such as physarum gates, soldier crab gates and some physarum computers.


The next question arises:33, 34 what is the essential property of bioinspired or biological computing? Bioinspired or biological computing is expressed as a multi-agent system, whether a multicellular or a unicellular organism, under indefinite conditions.1, 20, 21 Therefore, these systems are obliged to be probabilistic and/or asynchronous in updating due to indefinite and perturbed environments. Conversely, bioinspired and biological computing is robust due to this probabilistic and/or asynchronous property. Thus, it is interesting to study the relationship among quasi-optimal solutions, critical properties, and probabilistic and/or asynchronous updating in a computational system.32, 35–37 This leads to the possibility of probabilistic logic gates implemented by asynchronously updated computations. It is important to determine the core of bioinspired computing that is robust under perturbed conditions. To answer these questions, we take a special cellular automaton, the Game of Life (GL).38–41 Originally, the Game of Life was updated in a synchronous fashion and could be used as a logic gate by means of specific patterns called gliders and glider guns.39, 41 Since these patterns are generated in a synchronous fashion, their behavior can be strictly controlled by initial and boundary conditions. Although the trajectories of gliders going straight can be predicted, they can easily be broken by small perturbations. The question arises of how we could construct robust logic gates made of gliders. If this is possible, one can implement robust probabilistic logic gates rather than unstable deterministic logic gates. Thus, we aim to determine the core of bioinspired computing that is robust under perturbed conditions.

11.2. Asynchronous Game of Life and its Logic Gates

11.2.1. Phase transition and criticality

GL is a 2D cellular automaton in which the state of a cell is either 0 or 1.38–41 The state of a cell at the (i, j) site at the tth step is represented by $a^t_{i,j}$. To compare asynchronous GL with synchronous GL, we introduce the virtual state of a cell at the (i, j) site at the tth step, represented by $b^t_{i,j}$. The transition rule of the state of a cell is defined as follows.


In the case of $a^t_{i,j} = 0$:

$$\text{If } S^t_{i,j} = 3, \text{ then } b^{t+1}_{i,j} = 1; \quad \text{otherwise } b^{t+1}_{i,j} = 0. \qquad (11.1)$$

In the case of $a^t_{i,j} = 1$:

$$\text{If } S^t_{i,j} = 5 \text{ or } 6, \text{ then } b^{t+1}_{i,j} = 1; \quad \text{otherwise } b^{t+1}_{i,j} = 0, \qquad (11.2)$$

where

$$S^t_{i,j} = \sum_{m=-1}^{+1} \sum_{n=-1}^{+1} a^t_{i+m,\,j+n} - a^t_{i,j}. \qquad (11.3)$$

The transition of synchronous GL is defined by

$$a^{t+1}_{i,j} = b^{t+1}_{i,j}. \qquad (11.4)$$

In contrast, the transition of asynchronous GL is defined by using a probability $p$ with $0 \le p \le 1$ (e.g., Refs. [42–44]) such that

$$a^{t+1}_{i,j} = a^t_{i,j} \text{ with probability } p; \qquad a^{t+1}_{i,j} = b^{t+1}_{i,j} \text{ with probability } 1 - p. \qquad (11.5)$$

It is easy to see that there are both active and stationary phases in asynchronous GL and that a cell in the stationary phase is not obliged to follow transition rules (11.1)–(11.3). If p = 0, then asynchronous GL coincides with synchronous GL. If p = 1, then all cells are always in the stationary phase and preserve their given initial states over time. Figure 11.1 shows snapshots of the patterns generated by asynchronous GL controlled by the probability p in Equation (11.5), where the system size is 50 × 50. It is easy to see that under small p, the states converge to frozen states that oscillate with period 2 at most, while under large p, the states change chaotically. One can see the phase transition of the generated pattern with respect to p; a minimal implementation of this update rule is sketched below.
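For readers who want to experiment, here is a minimal NumPy sketch of one asynchronous update step, following Eqs. (11.1)–(11.5) as printed above (including the chapter's survival condition S = 5 or 6); the periodic boundary used here is our simplifying assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def async_gl_step(a: np.ndarray, p: float) -> np.ndarray:
    """One asynchronous GL update of a 0/1 array a, per Eqs. (11.1)-(11.5)."""
    # Eq. (11.3): sum over the 3x3 neighborhood minus the cell itself,
    # computed here with periodic (wrap-around) boundaries.
    S = sum(np.roll(np.roll(a, m, axis=0), n, axis=1)
            for m in (-1, 0, 1) for n in (-1, 0, 1)) - a
    # Eqs. (11.1)-(11.2): the virtual (synchronous) next state b.
    b = np.where(a == 0, S == 3, (S == 5) | (S == 6)).astype(int)
    # Eq. (11.5): each cell independently keeps its state with probability p.
    stay = rng.random(a.shape) < p
    return np.where(stay, a, b)
```

With p = 0 this reduces to the synchronous update (11.4), and with p = 1 the configuration never changes, matching the two limiting cases noted above.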


Figure 11.1. Snapshots of asynchronous GL controlled by the probability p. States 1 and 0 are represented by black squares or blanks.

Figure 11.2 shows the phase transition from ordered patterns to chaotic patterns with respect to the probability of the stationary phase. Patterns are generated in 500 × 500 squared cells with periodic boundaries. In each trial, it is determined whether all cells converge, within 10,000 steps, to oscillatory states with a period equal to or smaller than 2. If so, the trial is counted as reaching a frozen state. Over 100 trials, the normalized number of frozen states is interpreted as the probability of the frozen state. Each circle represents the probability of the frozen state plotted against the probability of the stationary phase, p. To estimate the position of the critical state, the range between 0.095 and 0.155 was intensively investigated. The red curve represents a sigmoidal curve fitted to the data:

$$y = \frac{1}{1 + e^{a(x-b)}}, \qquad (11.6)$$

where a = 115.038 and b = 0.1269. The critical value is approximately 0.13. Once the probability of the frozen state drops to 0, it is not recovered. If p is larger than 0.2, the states of asynchronous GL change perpetually in time and space.
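A fit of this form can be reproduced with SciPy; the arrays below are placeholders standing in for measured frozen-state probabilities, not the chapter's actual data.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, a, b):
    # Eq. (11.6): probability of the frozen state as a function of p.
    return 1.0 / (1.0 + np.exp(a * (x - b)))

# Placeholder "measurements" over the intensively studied range.
p_values = np.linspace(0.095, 0.155, 13)
frozen_prob = sigmoid(p_values, 115.0, 0.127)  # substitute real trial data here

(a_fit, b_fit), _ = curve_fit(sigmoid, p_values, frozen_prob, p0=(100.0, 0.13))
print(a_fit, b_fit)  # b_fit estimates the critical point (about 0.13)
```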


Figure 11.2. The probability of the frozen state plotted against p (the probability of the stationary phase). The red curve represents the approximation by the sigmoidal curve.

The next question arises: does the middle point between the two phases, order (oscillation) and chaos, indicate a critical state? Asynchronous cellular automata showing cluster patterns or class IV patterns frequently reveal a power-law distribution of the density of 1 plotted against time. Additionally, in asynchronous GL, the density against time is evaluated. Figure 11.3 shows the density of the state of 1 against time in a log–log plot. The initial condition is set with a density of states of 1 of 0.590. The system size is 150 × 150 cells and p = 0.13. The density against time is averaged over 20 trials and is represented by the green curve in Figure 11.3. The data are approximated by a power-law function

$$y = a(x-b)^c, \qquad (11.7)$$

where a = 0.020 and b = 0.081, and the exponent c of the power law is −0.1595. The exponent −0.1595 could have special meaning, since this value can be found in various contexts.45–47


Figure 11.3. The density of 1 plotted against time in a log–log scale. Data averaged over 20 trials are represented by green curves, and the approximated line (power law) is represented by a purple line.

The model of directed percolation is defined on a squared 2D system consisting of N × N cells. The state of a cell is either 0 or 1, where 1 means rocky material and 0 means a pore. The number of pores divided by $N^2$ represents the porosity. Initially, water is given to the cells in the top layer. Water falls vertically through the pores. If the water reaches at least one cell in the bottom layer, percolation succeeds. The normalized success rate of percolation is obtained over multiple trials, where the pores are distributed randomly in each trial. Directed percolation shows a phase transition between 100% success and 0% success of percolation. It is numerically found that the critical value of the porosity is 0.705 and that the density of water at the critical point decays in a power-law fashion with exponent −0.1595. Some rules of elementary cellular automata updated asynchronously also show a phase transition with respect to the probability of the stationary phase.47
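The following sketch simulates one version of this percolation experiment. The chapter does not spell out the exact downward neighborhood, so letting water spread to the cell directly below and to the two diagonal cells below is our assumption; the quoted critical porosity of about 0.705 corresponds to directed site percolation of this general kind.

```python
import numpy as np

rng = np.random.default_rng(0)

def percolates(porosity: float, N: int) -> bool:
    """One directed-percolation trial on an N x N grid (True = pore)."""
    pore = rng.random((N, N)) < porosity
    wet = pore[0].copy()              # water enters every top-row pore
    for row in range(1, N):
        below = wet.copy()            # straight down
        below[1:] |= wet[:-1]         # from the cell above-left
        below[:-1] |= wet[1:]         # from the cell above-right
        wet = below & pore[row]
        if not wet.any():             # water stopped before the bottom
            return False
    return True

# Success rate near the critical porosity quoted in the text (~0.705).
rate = np.mean([percolates(0.705, 200) for _ in range(100)])
print(rate)
```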


At the critical point, the density of the state of 1 decays in a power-law fashion with exponent −0.1595. If asynchronous updating is defined by the order of updating, where the map from the order of cells to the order of updating is a bijection, and if the order of updating can influence the transition rule, many elementary cellular automata show cluster-like or class IV-like patterns mixing localized periodic patterns with chaotic patterns. Most of them show power-law decay of the density of the state of 1 with an exponent of −0.1595.36, 37 Therefore, this exponent, −0.1595, might reflect a universal property of the critical point in the phase transition.

11.2.2. Computation by asynchronous GL

Synchronous GL can be used as a computing resource, since mesoscopic patterns called gliders can be used to transmit values. As shown in the left diagram of Figure 11.4, if a special configuration called a glider gun is prepared, a glider that goes straight while oscillating is perpetually generated. Indeed, if a glider stream is controlled, whether it is active (i.e., gliders are generated) or inactive (i.e., gliders are not generated), one can implement the AND gate and NOT gate. The right diagram of Figure 11.4 shows the implementation of the AND gate by gliders.

Figure 11.4. Glider and AND gate made of gliders in synchronous GL. For input gliders A and B set at the solid squares, output A AND B (represented by A&B) is set at the white square.


A dummy glider gun, represented by C and hidden in the gate device, constantly generates gliders; it is placed so that input gliders from A and B collide with the dummy gliders from C. If two gliders collide with each other, both gliders disappear. If firing a glider represents a value of 1, and no glider represents a value of 0, one can prepare the A AND B gate, as shown in the right diagram of Figure 11.4. This implementation reveals that if the input glider guns and the location of the output are stably controlled, synchronous GL has computational ability. However, the gliders are unstable against perturbation. Figure 11.5 shows the behavior of gliders under a perturbation in which the state of a cell is inverted with a probability of 0.0001 (i.e., 0 is replaced by 1 and vice versa), where there is no perturbation before t = 200. Once a perturbation occurs, the glider no longer goes straight and eventually collapses. Instead of normal gliders, broken gliders are generated and propagate radially. Even if there is no perturbation, gliders are not stable starting from an initial condition in which 0 and 1 are randomly distributed. Gliders are generated only very rarely from a random initial condition, and they are collapsed by fragile broken gliders propagating irregularly.

Figure 11.5. Gliders under the perturbation of the inversion of 0 and 1, with a probability of 0.0001.


Figure 11.6. Rare gliders from the initial condition where 0 and 1 are randomly distributed. Gliders are surrounded by red squares.

Figure 11.6 shows the general case soon after a random initial condition, in which only two gliders are generated. After this, local configurations fall into oscillatory states called flickers. Under no perturbation, only strict and rigid control of gliders can implement logic gates in synchronous GL. In other words, strict and rigid computations such as synchronous GL cannot contribute to real-world computation, which always confronts perturbed conditions. Although theoretical computer scientists focus on the implementation of universal Turing machines or complete machines, one should also think about robust, if not highly efficient or universal, computation. The next question arises: how can probabilistic computation be implemented under perturbed conditions? Here, we implement a probabilistic logic gate made of broken gliders generated in asynchronous GL. First, we estimate how many gliders are generated in the range of the phase transition of asynchronous GL. Figure 11.7 shows the relationship between the occurrence of gliders and the probability of the stationary phase in asynchronous GL. It is easy to see that the occurrence of gliders increases in the range surrounding the critical value, p ∼ 0.13. Far from the critical value, few gliders are generated.


Figure 11.7. Occurrence of gliders in asynchronous GL. In each diagram, gliders within 100 steps from the initial condition overlap.


Figure 11.8. Probability of glider per cell plotted against the probability of the stationary phase. The peak is found close to the critical point.

This implies that asynchronous GL with critical properties can be used as a probabilistic logic gate, since many broken, irregularly propagating gliders are generated around the critical value. Figure 11.8 shows the normalized occurrence of gliders plotted against the probability of the stationary phase. The number of gliders per time step is normalized by the system size, which is regarded as the probability of a glider per cell.


The curve represents an approximation by a polynomial function such as $y = -3.1924x^4 + 2.8931x^3 - 0.9137x^2 + 0.1109x + 0.0034$.

11.2.3. Logic gate in asynchronous GL

The probabilistic logic gate in asynchronous GL is implemented as follows. The square space consisting of $N^2$ cells is divided into four areas. The first area S(A) is for input value A, the second area S(B) is for input B, the third area S(C) is for the output value, and the fourth area is the rest of the $N^2$ cells. These areas are defined by

$$S(A) = \left\{ (i, j) \;\middle|\; i = 1, 2, \ldots, \frac{N}{3};\; j = 1, 2, \ldots, \frac{N}{3} \right\}, \qquad (11.8)$$

$$S(B) = \left\{ (i, j) \;\middle|\; i = \frac{2N}{3}, \frac{2N}{3} + 1, \ldots, N;\; j = \frac{2N}{3}, \frac{2N}{3} + 1, \ldots, N \right\}, \qquad (11.9)$$

$$S(C) = \left\{ (i, j) \;\middle|\; i = \frac{N}{3} + 1, \frac{N}{3} + 2, \ldots, \frac{2N}{3} - 1;\; j = \frac{N}{3} + 1, \frac{N}{3} + 2, \ldots, \frac{2N}{3} - 1 \right\}. \qquad (11.10)$$

In the AND gate, C is denoted A AND B (sometimes represented by A&B), and in the OR gate, C is denoted A OR B (sometimes represented by A∨B). In the probabilistic logic gate, the value 1 is defined by a density of 1s that exceeds a specific probability, P1. When a value of 1 is given for input X (with X = A, B), each cell in S(X) has state 1 with a probability exceeding P1. A value of 0 for input X is defined by the situation in which all cells in S(X) have state 0. After setting the input values, asynchronous GL with the probability of the stationary phase p = 0.13 (the critical value) is applied to all cells for T time steps, under a perturbation in which the state of a cell is inverted with probability pNOISE. If the density of 1s in S(C) exceeds P1 at time T, then output C is 1; otherwise, C is 0. Note that although the definition of input 0 is consistent with that of output 0, they are different from each other.
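A compact sketch of this protocol, assuming the NumPy setup (np, rng) and async_gl_step from the earlier sketch: gate_masks builds the three areas of Eqs. (11.8)–(11.10) with 0-based indices, and run_gate sets the inputs, applies T noisy asynchronous updates, and thresholds the output density. The 0.5 input density and the default parameters follow the AND-gate setting described below; the periodic boundary inherited from async_gl_step is a simplification of the chapter's zero boundary.

```python
def gate_masks(N: int):
    """Boolean masks for S(A), S(B), S(C) of Eqs. (11.8)-(11.10)."""
    t = N // 3
    S_A = np.zeros((N, N), bool); S_A[:t, :t] = True
    S_B = np.zeros((N, N), bool); S_B[2 * t:, 2 * t:] = True
    S_C = np.zeros((N, N), bool); S_C[t:2 * t - 1, t:2 * t - 1] = True
    return S_A, S_B, S_C

def run_gate(A, B, N=100, T=1500, p=0.13, p_noise=1e-4, P1=0.1):
    """Return the gate's output bit for inputs A, B in {0, 1}."""
    S_A, S_B, S_C = gate_masks(N)
    a = np.zeros((N, N), int)
    if A:  # input 1: each cell of S(A) is set to 1 with probability 0.5
        a[S_A] = rng.random(S_A.sum()) < 0.5
    if B:
        a[S_B] = rng.random(S_B.sum()) < 0.5
    for _ in range(T):
        a = async_gl_step(a, p)
        flip = rng.random(a.shape) < p_noise   # perturbation: invert states
        a = np.where(flip, 1 - a, a)
    return int(a[S_C].mean() > P1)             # output 1 iff density > P1
```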


Figure 11.9. Typical time development of AND gate in asynchronous GL. The white arrow represents time development. The inner squares A, B, and A&B represent S(A), S(B), and S(A&B), respectively.

Figure 11.9 shows the computational process of the AND gate in asynchronous GL. Here, we define T = 1500, N = 100, P1 = 0.1 and pNOISE = 0.0001. This process has A = 1 and B = 1, and thus A&B = 1. The top left diagram represents the initial condition, where A = 1 and B = 1 are set with a density of 1s of probability 0.5, which exceeds P1. After that, asynchronous GL is applied to all cells, with boundary conditions such that all boundary cells have state 0. As time proceeds, broken gliders generated in S(A) and S(B) propagate outside of S(A) and S(B), and finally (t = 1500 = T), the density of 1s for cells in S(A&B) exceeds P1. This implies A&B = 1. Figure 11.10 shows typical results of the AND gate in asynchronous GL. All parameters are set to the same values as in Figure 11.9. Since the probabilistic logic gate reproduces the truth table only with a specific probability, this result is just one typical example. Each pair of patterns connected by a blank arrow represents an initial condition and the corresponding final pattern at t = T. The results show that if A = 0 and B = 0, then A&B = 0 (top left in Figure 11.10); if A = 0 and B = 1, then A&B = 0 (top right in Figure 11.10); if A = 1 and B = 0, then A&B = 0 (bottom left in Figure 11.10);


Figure 11.10. Computational results of the probabilistic AND gate implemented by asynchronous GL with perturbation, pNOISE = 0.0001, P1 = 0.1 and p = 0.13.

and if A = 1 and B = 1, then A&B = 1 (bottom right in Figure 11.10). Thus, the results in Figure 11.10 are consistent with the truth table of the Boolean AND gate. The performance of this AND gate is discussed later. While the implementation of the OR gate is the same as that of the AND gate, changing the perturbation from the condition of the AND gate totally changes the behavior of the logic gate. If pNOISE of the AND gate, 0.0001, is replaced by 0.001, the OR gate can be constructed. We abbreviate A OR B as A∨B. Since broken gliders can grow under a ten times larger perturbation (pNOISE = 0.001), a pair of input 1 and input 0 can generate many more broken gliders that reach the output area, S(A∨B), than in the situation with pNOISE = 0.0001. Thus, the probability of cells in S(A∨B) having state 1 can exceed P1, which implies that the output is 1. In contrast, for the initial condition A = 0 and B = 0, there are no broken gliders, only occasional cells with state 1 surrounded by cells with state 0; such isolated cells with state 1 cannot grow into broken gliders that propagate to S(A∨B). That is why the output is 0.


Figure 11.11. Computational results of the probabilistic OR gate implemented by asynchronous GL with perturbation, pNOISE = 0.001, P1 = 0.1 and p = 0.13.

Figure 11.11 shows the computational results of the OR gate implemented by asynchronous GL with the probability of the stationary phase p = 0.13. The input and output conditions are the same as those of the AND gate; only the perturbation is different, namely pNOISE = 0.001. As shown in Figure 11.11, A = 0 and B = 0 lead to A∨B = 0; A = 0 and B = 1 lead to A∨B = 1; A = 1 and B = 0 lead to A∨B = 1; and A = 1 and B = 1 lead to A∨B = 1. Due to the large perturbation, broken gliders can reach S(A∨B). The cells with state 1 initially set as an input of 1 first decrease in number and then grow and propagate radially, as if exploding. Thus, broken gliders propagate everywhere, and cells in state 1 can grow in S(A∨B). Now, we evaluate the performance of the AND gate and OR gate implemented by asynchronous GL. The conditions of the two gates are the same as those of the gates in Figures 11.10 and 11.11. For each input pair A and B, it is determined whether the density of cells with state 1 in S(A&B) or S(A∨B) exceeds P1 (i.e., output = 1) or not (output = 0) at t = T.


The number of trials is 100 for each input pair, and the probability of output 1 is obtained as the number of instances of output 1 divided by the number of trials. In the top left of Figure 11.12, four histograms show the frequency distribution of the cover of S(A&B), for the conditions A = 0 and B = 0, A = 0 and B = 1, A = 1 and B = 0, and A = 1 and B = 1. The horizontal axis represents the number of cells in S(A&B) whose state is 1 at T (i.e., the cover of S(A&B)), where the actual number is obtained as the plotted number multiplied by 20 (i.e., a cover of 3 represents 60 cells). The vertical axis represents the frequency corresponding to the cover. The red line in the histogram represents P1 multiplied by the number of cells in S(A&B): a cover on the right-hand side of the red line leads to an output of 1, and one on the left-hand side leads to an output of 0. Therefore, the sum of the frequencies on the right-hand side of the red line divided by the number of trials (100) is the probability of the output of 1. This yields the probability version of the truth table of the AND gate, as shown in the top right of Figure 11.12. The probability of the truth table is expressed as a histogram whose horizontal axis represents a pair of inputs (00 represents A = 0 and B = 0; the other abbreviations are similar), and whose vertical axis represents the probability of the output of 1 for the corresponding input pair. The histogram shows that the probability of the output of 1 exceeds 0.5 only for the input 11. This result is consistent with the truth table of the Boolean AND gate. The bottom left part of Figure 11.12 shows the frequency distribution of the cover of the output area, S(A∨B), for inputs 00, 01, 10, and 11 in the OR gate implemented by asynchronous GL. The red line is the same as in the histogram for the AND gate. It is easy to see that most trials for inputs 01, 10, and 11 exceed the red line, where P1 serves as the threshold for the output of 1. This leads to the histogram for the probability of the truth table corresponding to the OR gate, as shown in the bottom right of Figure 11.12. It shows that the probability version of the truth table of the OR gate is consistent with that of the Boolean OR gate. Finally, we can conclude that both AND and OR gates can be implemented by asynchronous GL at the critical point.
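The trial loop behind such truth-table probabilities can be sketched in a few lines on top of the run_gate function from the earlier sketch; the printed values are illustrative, not the chapter's measured data.

```python
def truth_table_probability(p_noise, trials=100):
    """Estimate Pr[output = 1] for each input pair, as in Figure 11.12."""
    return {(A, B): sum(run_gate(A, B, p_noise=p_noise)
                        for _ in range(trials)) / trials
            for A in (0, 1) for B in (0, 1)}

# AND-like behavior is expected at p_noise = 1e-4, OR-like at 1e-3.
print(truth_table_probability(1e-4))
print(truth_table_probability(1e-3))
```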

page 390


Figure 11.12. Frequency distribution of the cover of the output area (S(A&B) or S(A∨B)) for each input pair (left four histograms) and the probability of the output of 1 against the input pair (right histogram). The top corresponds to the AND gate, and the bottom corresponds to the OR gate implemented by asynchronous GL.



11.3. Discussion

In Section 11.1, we discuss the metaheuristic approach based on bioinspired computing, unconventional computing and self-organized criticality. It is known that these approaches can obtain quasi-optimal solutions for ill-defined, incomplete problems and that they frequently show critical properties characterized by a power-law distribution. How can one manipulate a search method to obtain a quasi-optimal solution under perturbed conditions? This question is immediately related to the essential core of biological and/or unconventional computing. Since there is no global clock in a biological system, interactions between components in a biological system are destined to be asynchronous. Although asynchronous computing is ubiquitous in natural computing systems, it has not been studied much, since its behavior differs from that of synchronous computing, which has been studied extensively, and it is far less controllable. That is why implementing logic gates in asynchronous computing could be a touchstone for manipulating biological and/or unconventional computing. In this chapter, we take a 2D cellular automaton, GL, and study its behavior under asynchronous updating. Although GL under synchronous updating can generate controllable gliders that can be used to implement logic gates, GL under asynchronous updating cannot generate controllable gliders. Although the gliders generated in asynchronous GL are also unstable and rapidly collapse into fragments, they are regenerated by the perturbation, and broken gliders can propagate. Thus, these probabilistic gliders can be used to implement probabilistic logic gates. First, we find that asynchronous GL shows a clear phase transition between a phase of oscillatory states and a phase of chaotic states. The parameter controlling the phase is the probability of the stationary phase in asynchronous updating. In particular, we find that the critical point shows power-law decay with an exponent of −0.1595. Since this exponent is found in various critical phenomena, it can be considered a universal property.


Second, we estimate the relationship between computability and criticality in asynchronous GL. Since logic gates in synchronous GL are constructed using gliders, computability is evaluated with respect to the number of gliders generated. We find that the most gliders are generated in the region close to the critical point in the asynchronous-GL space parameterized by the probability of the stationary phase. This suggests that critical phenomena have autonomous computability and/or self-organizing reachability to quasi-optimal solutions. It is well known that most biological systems show power-law distributions in various properties and that living biological systems are adapted to their environments. Based on these results, it can be said that systems showing critical properties have the ability to compute fitness in their environments, which can make the systems themselves able to adapt to those environments. Finally, we construct probabilistic logic gates implemented by asynchronous GL at the critical point. Since the generation and trajectories of gliders cannot be controlled in these logic gates, manipulating gliders is destined to be probabilistic. Manipulating logical values in these gates is much more probabilistic than manipulating logical values in chemical computations such as BZ reaction computers. Nevertheless, one can construct both AND gates and OR gates in asynchronous GL. One of the most intriguing aspects of probabilistic logic gates implemented by asynchronous GL is the tuning of noise in manipulating the gates. While the essential mechanism and most of the conditions of the OR gate are the same as those of the AND gate, the level of perturbation alone makes the difference between the AND gate and the OR gate. In other words, increasing the perturbation can turn the AND gate into the OR gate. This result may reflect an essential property of living systems, since the AND gate outputs the value 1 only for the restricted input pair 1 and 1, while the OR gate outputs the value 1 for various input pairs: 1 and 0, 0 and 1, and 1 and 1. If the value 1 is interpreted as the active state of an organism, replacing the AND gate with the OR gate implies replacing activation under restricted conditions with activation under various conditions.


In other words, it suggests that organisms utilize various resources if the environmental conditions worsen (i.e., under more perturbation). In this sense, the change from the AND gate to the OR gate induced by increasing perturbation is more biological than computational. If one can utilize the change from the AND gate to the OR gate in response to a change in the situation, robust self-organizing computing could be implemented. Since the performance of the logic gates is probabilistic, distributive laws such as A&(B∨C) = (A&B)∨(A&C) cannot always hold. This implies that the distributive law is weak, which could be related to the orthomodular lattices corresponding to quantum logic.48

11.4. Conclusion

Motivated by the ideas of metaheuristics and/or self-organized criticality, we investigate a cellular automaton as a system carrying the essential core of biological and/or unconventional computing. Since the possibility of metaheuristics can be expressed as manipulating computations and the possibility of self-organized criticality can be expressed as a power-law distribution, we investigate the behavior of asynchronous GL and estimate whether logic gates can be implemented in it. We find that asynchronous GL shows a phase transition and that the GL at the critical point could have the highest computability with respect to the ability to generate gliders. Finally, we implement probabilistic AND gates and OR gates in asynchronous GL at the critical point, which perform as logic gates to some extent. In our logic gates, perturbation plays an essential role in computing: a tenfold increase in perturbation makes the AND gate become the OR gate under otherwise identical conditions. Our study can be a touchstone for spelling out the novel significance of biological and/or unconventional computing by showing the role of noise tuned to operate logic gates.

Acknowledgements

This work was financially supported by JSPS 18K18478 and JPJS 00120351748.


References
1. A. Erskine and J. M. Hermann, CriPS: Critical particle swarm optimization. Proc. Eur. Conf. Art. Life 207–214 (2014).
2. G.-G. Wang, L. Guo, A. H. Gandomi, G.-S. Hao, and H. Wang, Chaotic Krill Herd algorithm. Information Science 274, 17–34 (2014).
3. S. Chakrabarty, A. K. Pal, N. Dey, D. Das, and S. Acharjee, Foliage area computation using Monarch butterfly algorithm. In 1st International Conference on Non-conventional Energy (ICONCE) (IEEE, 2014), pp. 249–253.
4. M. Dorigo and T. Stützle, Ant Colony Optimization (MIT Press, 2004).
5. G.-G. Wang, S. Deb, and L. D. S. Coelho, Earthworm optimization algorithm: A bio-inspired metaheuristic algorithm for global optimization problems. Int. J. Bio-inspired Comput. 12(1), 1 (2018). doi:10.1504/IJBIC.2018.093328.
6. R. Toth, C. Stone, A. Adamatzky, B. L. Costello, and L. Bull, Experimental validation of binary collisions between wave fragments in the photosensitive Belousov–Zhabotinsky reaction. Chaos, Solitons & Fractals 14(4), 1605–1615 (2009).
7. T. Nakagaki, H. Yamada, and A. Toth, Maze-solving by an amoeboid organism. Nature 407, 470 (2000).
8. A. Takamatsu, R. Tanaka, H. Yamada, T. Nakagaki, T. Fujii, and I. Endo, Spatiotemporal symmetry in rings of coupled biological oscillators of Physarum plasmodial slime mold. Phys. Rev. Lett. 99, 068104 (2001).
9. A. Tero, S. Takagi, T. Saigusa, K. Ito, D. P. Bebber, M. D. Fricker, K. Yumiki, R. Kobayashi, and T. Nakagaki, Rules for biologically inspired adaptive network design. Science 327, 439–442 (2010). doi:10.1126/science.1177894.
10. S. Tsuda, M. Aono, and Y.-P. Gunji, Robust and emergent Physarum computing. BioSystems 73, 45–55 (2004).
11. Y.-P. Gunji, T. Shirakawa, T. Niizato, and T. Haruna, Minimal model of a cell connecting amoebic motion and adaptive transport networks. J. Theor. Biol. 253, 659–667 (2008).
12. Y.-P. Gunji, T. Shirakawa, T. Niizato, M. Yamachiyo, and I. Tani, An adaptive and robust biological network based on the vacant-particle transportation model. J. Theor. Biol. 272, 187–200 (2011).
13. A. Adamatzky, Physarum Machines: Computers from Slime Mold (World Scientific, 2010).
14. M. Aono, S. Kasai, S.-J. Kim, M. Wakabayashi, H. Miwa, and M. Naruse, Amoeba-inspired nanoarchitectonic computing implemented using electrical Brownian ratchets. Nanotechnology 26, 234001 (2015). doi:10.1088/0957-4484/26/23/234001.
15. L. Zhu, S.-J. Kim, M. Hara, and M. Aono, Remarkable problem-solving ability of unicellular amoeboid organism and its mechanism. Roy. Soc. Open Sci. 5, 180396 (2018). http://dx.doi.org/10.1098/rsos.180396.
16. A. Adamatzky, On spiking behaviour of oyster fungi Pleurotus djamor. Sci. Rep. 8, 1–7 (2018).


17. A. Adamatzky, M. Tegelaar, H. A. Wosten, A. L. Powell, A. E. Beasley, and R. Mayne, On Boolean gates in fungal colony. Biosystems 104138 (2020).
18. Y.-P. Gunji, Y. Nishiyama, and A. Adamatzky, Robust soldier crab ball gate. Complex Syst. 20, 94–104 (2014).
19. C. G. Cordero, Parameter adaptation and criticality in particle swarm optimization. arXiv:1705.06966v1 [cs.NE], 19 May 2017.
20. Y. P. Gunji, M. Murakami, T. Kawai, M. Minoura, and S. Shinohara, Universal criticality beyond the trade-off in swarm models resulting from the data-hypothesis interaction in Bayesian inference. Comput. Struct. Biotechnol. 19, 247–260 (2021).
21. P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality: An explanation of 1/f noise. Phys. Rev. Lett. 59, 381–384 (1987).
22. P. Bak and C. Tang, Earthquakes as a self-organized critical phenomenon. J. Geol. Res. 94, 15635–15637 (1989).
23. S. A. Kauffman and S. Johnsen, Coevolution to the edge of chaos: Coupled fitness landscapes, poised states, and coevolutionary avalanches. J. Theor. Biol. 149(4), 467–505 (1991).
24. P. Bak and K. Sneppen, Punctuated equilibrium and criticality in a simple model of evolution. Phys. Rev. Lett. 71, 4083–4086 (1993).
25. A. M. Reynolds, Current status and future directions of Lévy walk research. Biology Open 7, bio030106 (2018).
26. G. M. Viswanathan, S. V. Buldyrev, S. Havlin, M. G. E. da Luz, E. P. Raposo, and H. Eugene Stanley, Optimizing the success of random searches. Nature 401, 911–914 (1999).
27. F. Bartumeus, Lévy processes in animal movement: An evolutionary hypothesis. Fractals 15, 151–162 (2007).
28. D. W. Sims et al., Scaling law of marine predator search behaviour. Nature 451, 1098–1102 (2008).
29. N. E. Humphries et al., Environmental context explains Lévy and Brownian movement patterns of marine predators. Nature 465, 1066–1069 (2010).
30. A. M. Reynolds, P. Schultheiss, and K. Cheng, Are Lévy flight patterns derived from the Weber–Fechner law in distance estimation? Behav. Ecol. Sociobiol. (2013). doi:10.1007/s00265-013-1549-y.
31. Y. P. Gunji and D. Uragami, Breaking of the trade-off principle between computational universality and efficiency by asynchronous updating. Entropy 22(6), 1049 (2020).
32. M. Conrad, Adaptability (Plenum Publishing Corp., New York, 1983).
33. M. Conrad, On design principles for a molecular computer. Commun. ACM 28(5), 464–480 (1985).
34. Y. Gunji, Pigment color patterns of molluscs as autonomy, generated by asynchronous automata. Biosystems 23, 317–334 (1990).
35. Y. P. Gunji, Self-organized criticality in asynchronously tuned elementary cellular automata. Complex. Syst. 23, 55–69 (2014).
36. Y. P. Gunji, Extended self-organized criticality in asynchronously tuned cellular automata. In B. Vasileios (ed.), Chaos, Information Processing and Paradoxical Games (World Scientific, Singapore, 2014).


37. M. Gardner, The fantastic combinations of John Conway's new solitaire game 'Life'. Scientific American 223, 120–123 (1970).
38. E. R. Berlekamp, J. H. Conway, and R. Guy, Winning Ways for Your Mathematical Plays, vol. 2 (Academic Press, 1982).
39. J. Nordfalk and P. Alstrøm, Phase transitions near the "game of Life". Phys. Rev. E 54(2), R1025–1028 (1996).
40. J.-P. Rennard, Implementation of logical functions in the Game of Life. In A. Adamatzky (ed.), Collision-Based Computing (Springer, 2002), pp. 491–512.
41. N. Fatès and M. Morvan, An experimental study of robustness to asynchronism for elementary cellular automata. Complex. Syst. 16, 1–27 (2001).
42. N. Fatès, É. Thierry, M. Morvan, and N. Schabanel, Fully asynchronous behavior of double-quiescent elementary cellular automata. Theor. Comp. Sci. 362, 1–16 (2006).
43. N. Fatès, A guided tour of asynchronous cellular automata. J. Cell. Autom. 9, 387–416 (2014).
44. E. Domany and W. Kinzel, Equivalence of cellular automata to Ising models and directed percolation. Phys. Rev. Lett. 53, 311–314 (1984).
45. H. Hinrichsen, Non-equilibrium critical phenomena and phase transitions into absorbing states. Adv. Phys. 49(4), 815–958 (2000).
46. N. Fatès and M. Morvan, An experimental study of robustness to asynchronism for elementary cellular automata. Complex Systems 16, 1–27 (2005).
47. Y. P. Gunji, K. Nakamura, M. Minoura, and A. Adamatzky, Three types of logical structure resulting from the trilemma of free will, determinism and locality. BioSystems (2020). doi:10.1016/j.biosystems.2020.104151.
48. T. Nakagaki, M. Iima, T. Ueda, Y. Nishiura, T. Saigusa, A. Tero, R. Kobayashi, and K. Showalter, Minimum-risk path finding by an adaptive amoebal network. Phys. Rev. Lett. 99, 068104 (2007). doi:10.1103/PhysRevLett.99.068104.
49. Y. Nishiyama, Y.-P. Gunji, and A. Adamatzky, Collision-based computing implemented by soldier crab swarms. Int. J. Parallel, Emergent Distrib. Syst. (2012). doi:10.1080/17445760.2012.662682.
50. Y. P. Gunji, S. Shinohara, T. Haruna, and V. Basios, Inverse Bayesian inference as a key of consciousness featuring a macroscopic quantum logic structure. BioSystems 152, 44–63 (2017).



© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0012

Chapter 12

A Mystery of Human Biological Development — Can It Be Used to Speed up Computations?

Olga Kosheleva∗ and Vladik Kreinovich†

University of Texas at El Paso, 500 W. University, El Paso, Texas 79968, USA
∗ [email protected][email protected]

For many practical problems, the only known algorithms for solving them require non-feasible exponential time. To make computations feasible, we need an exponential speedup. A reasonable way to look for such a possible speedup is to search for real-life phenomena where such a speedup can be observed. A natural place to look is the biological activity of human beings, since we, after all, solve many complex problems that even modern super-fast computers have trouble solving. Up to now, this search has not been successful: for example, there are people who compute much faster than others, but it turns out that their speedup is linear, not exponential. In this chapter, we want to attract the reader's attention to the fact that recently, an exponential speedup was indeed found: namely, it turns out that the biological development of humans is, on average, exponentially faster than the biological development of such smart animals as dogs. We hope that unveiling the processes behind this unexpected speedup can help us achieve a similar speedup in computations.

12.1. Formulation of the Problem

Many real-life problems probably require unfeasible exponential computation time.


In most practical problems, it is feasible to check whether a given candidate for a solution is indeed a solution. For example:
• If we are given what is supposed to be a detailed proof of a mathematical statement, then it is possible to check, step by step, that this proof is indeed correct.
• If we are given a dependence that all measurement results are supposed to satisfy, then it is easy to check, observation by observation, that this dependence is indeed satisfied.
• If we are given a design of a plane, then it is possible to simulate its reaction to different weather conditions and thus check that this design satisfies all the given specifications.
The class of all the problems in which we can feasibly check whether a given candidate is indeed a solution is known as NP. Some of the problems from this class can be solved in feasible time. The class of all such feasibly solvable problems is denoted by P (see, e.g., Refs. [1, 2]). It is still an open problem whether P contains all the problems from the class NP, that is, whether P = NP. Most computer scientists believe that these two classes are different. What is known is that the class NP contains problems which are harder than all others, in the sense that every other problem from the class NP can be feasibly reduced to such a problem. Such problems are known as NP-complete. If P and NP are different, as most computer scientists believe, then for an NP-complete problem, it is not possible to have a feasible algorithm that solves all its instances. Many practical problems have been proven to be NP-complete. Thus, for these problems, we cannot have a general feasible algorithm. Most probably, this means that these problems require exponential time like $2^n$, where n is the size of the input. Such algorithms are not practically feasible: already for n ≈ 300, the exponential time exceeds the lifetime of the Universe.

How can we solve NP-complete problems faster?


Since we cannot feasibly solve NP-complete problems on usual computers by using known algorithms and known physical processes, a natural idea is to look for new algorithms and/or new physical processes that would hopefully allow us to solve these problems faster.

How can we find such algorithms and processes? One way is to look for real-life phenomena in which some instances are exponentially faster than others.

Faster does not mean exponentially faster. Of course, there are many cases when some processes are faster. A natural place to look is at us humans, since we actually solve many complex problems. Our abilities to solve problems differ, so it is reasonable to look for people who can solve problems faster. For example, there exist people who can perform arithmetic computations much faster than others (see, e.g., Refs. [3–9]). However, it turns out that the corresponding speedup factor is the same for all problems,10–12 irrespective of the input size n. So, even if we learn the algorithm which is (subconsciously) used by these people, we will only decrease the computation time by a constant factor; in general, exponential time remains exponential. Similarly, recent research13 has shown that there is, on average, a constant difference between the times when men and women reach the same stage of biological development, which means that the corresponding phenomena also cannot be used for an exponential speedup.

What we do in this chapter. In this chapter, we show that there is a phenomenon with an observed exponential speedup, a phenomenon related to human biological development. Thus, there is a chance that, by studying this phenomenon, we will be able to achieve a similar exponential speedup in our computations, and thus be able to solve NP-complete problems in feasible time.

12.2. Exponential Speedup Phenomenon: A Brief History and a Brief Description

A newly observed phenomenon. The story starts with attempts to compare the biological development of different species,


402

development of very intelligent animals such as dogs. In the first approximation, this relation is described by a known linear formula: that a human age h corresponding to the same biological development stage as the dog’s age d is approximately equal to h ≈ 7d. This formula is known to be approximate. For example, right after birth, pups become independent more quickly than this formula — while human babies remain helpless for much longer relative time. A recent research14 analyzed a large amount of data on biological development of humans and dogs, and based on this data, came up with the following formula that provide a more accurate match between the times needed to reach the same biological development stages: h = 16 · ln(d) + 31. This is indeed an exponential speedup. What this logarithmic formula means is what takes exponential time d = c · 2n to develop in a dog will take linear time 16 · ln(a · 2n ) = (16 · ln(2)) · n + (16 · ln(a) + 31) in a human! So, this is indeed an — unexpected — example of an exponential speedup. This is a very new result. At this moment, it is not clear where this speedup comes from — especially since until this result, no one noticed any fundamental difference between humans and higher animals in terms of simple (non-brain-related) biological development. Hopefully, a detailed analysis of the situation will reveal some mechanisms that will turn out to be useful to speed up computations as well. Acknowledgments This work was supported in part by the US National Science Foundation grants 1623190 (A Model of Change for Preparing a New Generation for Professional Practice in Computer Science) and HRD-1242122 (Cyber-ShARE Center of Excellence).


References
1. V. Kreinovich, A. Lakeyev, J. Rohn, and P. Kahl, Computational Complexity and Feasibility of Data Processing and Interval Computations (Kluwer, Dordrecht, 1998).
2. C. H. Papadimitriou, Computational Complexity (Addison Wesley, 1994).
3. P. Beckmann, A History of π (Barnes & Noble, New York, 1991).
4. M. d'Ocagne, Le Calcul Simplifié par les Procédés Mécaniques et Graphiques (Gauthier–Villars, Paris, 1905).
5. M. d'Ocagne, Le Calcul Simplifié: Graphical and Mechanical Methods for Simplifying Calculation (MIT Press, Cambridge, Massachusetts, 1986).
6. D. R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid (Basic Books, 1999).
7. A. R. Luria, The Mind of a Mnemonist: A Little Book About a Vast Memory (Harvard University Press, Cambridge, Massachusetts, 1987).
8. O. Stepanov, http://stepanov.lk.net/
9. R. Tocquet, The Magic of Numbers (Fawcett Publications, Robbinsdale, Minnesota, 1965).
10. O. Kosheleva, Can we learn algorithms from people who compute fast: An indirect analysis in the presence of fuzzy descriptions. In Proceedings of the 2009 World Congress of the International Fuzzy Systems Association IFSA'2009, Lisbon, Portugal, July 20–24, 2009, pp. 1394–1397.
11. O. Kosheleva and V. Kreinovich, Can we learn algorithms from people who compute fast. In R. Seising and V. Sanz (eds.), Soft Computing in Humanities and Social Sciences (Springer Verlag, Heidelberg, 2011), pp. 267–275.
12. O. Kosheleva and K. Villaverde, How Interval and Fuzzy Techniques Can Improve Teaching (Springer Verlag, 2018).
13. M. S. Goyal, T. M. Blazey, Y. Su, L. E. Couture, T. J. Durbin, R. J. Bateman, T. L. S. Benzinger, J. C. Morris, M. E. Raichle, and A. G. Vlassenko, Persistent metabolic youth in the aging female brain. Proc. US Nat. Acad. Sci. (2019). doi:10.1073/pnas.1815917116.
14. T. Wang, J. Ma, A. N. Hogan, S. Fong, K. Licon, B. Tsui, J. F. Kreisberg, P. D. Adams, A.-R. Carvunis, D. L. Bannasch, E. A. Ostrander, and T. Ideker, Quantitative translation of dog-to-human aging by conserved remodeling of epigenetic networks. Cell Syst. 11(2), 176–185.e6 (2020).



© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0013

Chapter 13

Symmetric Automata and Computations

Mark Burgin
Department of Computer Science, University of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA 90095, USA

The theory of algorithms, automata and computation is aimed at the modeling and exploration of computers, cell phones, computer networks, such as the Internet, and processes in them. Traditional models of these information processing systems reflect only the transformation of data. These transformations are controlled by programs, which stay unchanged during the whole computational process. However, in physical networks and computers, programs are also changed by special software tools such as interpreters, compilers, optimizers and translators. That is why, to further develop traditional models of computation, here we introduce and study a more complete model of computation. It is called a symmetric automaton, or symmetric machine, because it performs symmetric computations, transforming not only data but also the programs that control data transformation. Relations between symmetric automata and conventional automata are studied. It is demonstrated that in many cases, symmetric automata allow improving the efficiency and decreasing the complexity of computations.

13.1. Introduction

After the Turing machine became the foremost model of computation, many researchers tried to build automata or algorithms that would be more powerful than Turing machines. In such a way, multihead and multitape Turing machines, Turing machines with n-dimensional tapes, nondeterministic and probabilistic Turing machines, Kolmogorov algorithms, random access machines (RAM), parallel


random access machines (PRAM) and some other models have been elaborated. However, in each case it was proved that these automata could not compute more functions than the simplest Turing machine with a 1D tape. This situation brought forth the Church–Turing Thesis, which asserted that Turing machines were the most powerful class of algorithms. For a long time, computer scientists believed in this thesis because, until the 1980s, all invented automata satisfied the statement of the Church–Turing Thesis. However, several researchers challenged this Thesis. For instance, in his talk at the International Congress of Mathematicians in 1958, Kleene formulated a conjecture that it might be possible that algorithms that change their programs while computing would be more powerful than Turing machines.1 The first theoretical model of algorithms that change their programs while computing was the reflexive Turing machine.2 However, it was proved that the class of reflexive Turing machines is computationally equivalent to the class of Turing machines, that is, these machines have the same computing power. In such a way, Kleene's conjecture was disproved, but at the same time, it was demonstrated that reflexive Turing machines can be essentially more efficient than Turing machines. Namely, a relevant reflexive Turing machine can effectively outperform any Turing machine that computes the same function.2, 3 This direction of computer science was further developed when Schroeder introduced a new model of computation, symmetric Turing machines or S-machines, to reflect important properties of computers.4, 5 To understand this innovation, it is necessary to compare the new structure with conventional Turing machines. In a conventional Turing machine T , the head (processor) performs operations with data in the memory (tape) using a fixed system of instructions — the program of T . In contrast to this, in a symmetric Turing machine, information processing goes not only from the head to the memory but also backward. On the one hand, the head (processor) performs operations with data in the memory of an S-machine using instructions from the program. On the other hand, the memory of an S-machine performs operations with instructions from the head (processor).


Physical computers also perform operations with their programs using special software tools such as interpreters, compilers, optimizers and translators. There are also program optimizers, which improve characteristics of other programs by transforming them. It means that the symmetric Turing machines introduced by Schroeder theoretically model physical computers working in the recursive mode. The corresponding features of computers working in the inductive mode were theoretically represented in the concept of a symmetric inductive Turing machine, which was introduced in Ref. [6]. It is necessary to remark that the term symmetric Turing machine was also used in Ref. [7] with a different meaning. To discern the symmetric Turing machines introduced by Schroeder from the symmetric Turing machines introduced by Lewis and Papadimitriou, we call the former operationally symmetric Turing machines and the latter instructionally symmetric Turing machines. Comparing operationally symmetric Turing machines and reflexive Turing machines as tools for decreasing the complexity of information processing,2, 3 we see that the latter form an important special case of operationally symmetric Turing machines. In this chapter, we extend the concepts of machine symmetry and reflexivity, going from symmetric Turing machines to general symmetric automata and their important special case, symmetric instruction machines, which include as particular instances operationally symmetric Turing machines, operationally symmetric finite automata, operationally symmetric inductive Turing machines, and operationally symmetric inductive instruction machines. Besides, we construct a mathematical formalization of the concept of an operationally symmetric Turing machine. This is a necessary step because the descriptions in Refs. [4, 5, 8] only explain what operationally symmetric Turing machines do, while treating these machines as algorithms demands an exact specification of how they perform their operations. This mathematical specification of operationally symmetric Turing machines brings us to three basic types of operationally symmetric Turing machines. The obtained mathematical models allow us to study properties of these machines, explicating relations between different types of operationally symmetric Turing


machines and conventional Turing machines. In addition, it is demonstrated that in many cases, symmetric automata allow improving the efficiency and decreasing the complexity of computations. In Section 13.5, Conclusion, we summarize the obtained results and suggest directions for future research.

13.2. Structure of Abstract Automata and Instruction Machines

Similar to physical computers, abstract automata have three components: hardware, software, and infware.9

Definition 2.1. Infware of an abstract automaton consists of the data processed by this automaton.

In the process of the development of computer science, specific abstract automata have been created for processing different kinds of infware. As a rule, abstract automata (computing devices) work with finite words (i.e., 1D structures), which form linear languages. Consequently, their infware consists of finite words in a finite alphabet. Turing machines with 2D tapes and 2D cellular automata operate on finite 2D arrays of symbols. It means that their infware consists of finite 2D arrays of symbols. Turing machines with n-dimensional tapes and n-dimensional cellular automata work with finite n-dimensional arrays of symbols. It means that their infware consists of finite n-dimensional arrays of symbols. Some automata, such as Kolmogorov algorithms and storage modification machines, process arbitrary finite graphs.10, 11 Consequently, their infware consists of finite graphs. There are also many practical algorithms that work with finite graphs (cf., e.g., Ref. [12]). Consequently, their infware also consists of finite graphs. Structural machines work with arbitrary structures.13, 14 Consequently, their infware consists of arbitrary structures.


Definition 2.2. Hardware of an abstract automaton consists of theoretical devices, such as a control device, processor or memory, which provide means for computation by this automaton.

In a general case, a physical information processing system has three basic parts: the input system, the output system and the processor, which consists of one or several processing (e.g., computing) units. Consequently, to correctly model the physical information processing systems that people practically use, the hardware of an abstract automaton needs three basic parts: an (abstract) input device, an (abstract) information processor, and an (abstract) output device.9 In many models, input and output devices either are not specified or are represented by components of the common memory and/or of the processor. For instance, in a conventional Turing machine, operations of input and output, if any, are performed using the working memory, that is, the tape. Inductive Turing machines have special input and output registers, for example, tapes. The same is true for pointer machines. Indeed, a pointer machine receives input, bounded sequences of symbols (words), from its “read-only tape” (or an equivalent storage device), and it writes output sequences of symbols on an output “write-only” tape (or an equivalent storage device). Neural networks also contain all these basic parts: the input device comprises all input neurons, the output device consists of all output neurons, and it is possible to treat all neurons of the network, or only all hidden neurons, as its information processor.15 Usually input and output neurons are called visible neurons.

Definition 2.3. Software of an abstract automaton consists of texts, which control the functioning of this automaton.

For instance, many kinds of algorithms and abstract automata, such as finite automata, pushdown automata, register machines, Kolmogorov algorithms, random access machines (RAM), and Turing machines, use systems of instructions, for example, in the form of transition rules, to control computational processes. These systems of instructions (rules) constitute the software of the corresponding automata and machines.


The system of weights, activation functions, threshold functions and output functions forms the software of neural networks. It is possible to treat these systems as algorithms, although their form is different from that of traditional algorithms, which are described as systems of instructions. However, the software of the majority of algorithms and abstract automata consists of systems of instructions or rules. These instructions (rules) determine the computational processes performed by automata. Systems of such instructions (rules) are called algorithms. All classes of abstract automata that utilize such algorithms are unified by the comprehensive concept of an instruction machine.

Definition 2.4. (a) An instruction machine, or instruction automaton, M is an automaton the functioning of which is determined by a system of instructions (rules). (b) A pure instruction machine, or pure instruction automaton, M is an automaton the functioning of which is determined only by a system of instructions (rules) and its input.

Note that the functioning of an instruction automaton is not necessarily uniquely determined by its system of instructions. For instance, its functioning can also depend on the states of its control device, as in Turing machines. Besides, in nondeterministic instruction machines, for example, in nondeterministic Turing machines, there are metarules of instruction selection, which can essentially influence machine functioning.16, 17 At the same time, an important dynamic characteristic of the majority of abstract automata is their state. This brings us to another important class of automata.

Definition 2.5. (a) A state machine, or state automaton, M has a control device and is an automaton the functioning of which is determined by the states of its control device. (b) A pure state machine, or pure state automaton, M has a control device and is an automaton the functioning of which is determined only by the states of its control device and its input.


Note that the control device of an automaton can coincide with the whole automaton. In this case, the functioning of the automaton is determined by its states. However, when the automaton (machine) has a separate control device, this makes the automaton (machine) more flexible. Often state machines (state automata) are also instruction machines (automata) with systems of instructions. Moreover, implicitly any state machine (automaton) M is an instruction machine (automaton). Indeed, if we take descriptions of how the functioning of the machine (automaton) M depends on its states, we obtain instructions (rules) of its functioning. We observe this situation in the case of finite automata. A finite automaton is a pure state machine (automaton), but its transition function (relation) makes it also an instruction machine (automaton) because the transition function is described by transition rules. Let us consider the structure of a regular state instruction machine (state instruction automaton). In a general case, a state instruction machine (state instruction automaton) M has three components:

• The control device CM , which represents the states of the machine (automaton) M .
• The memory WM , which stores data.
• The processor PM , which transforms (processes) information (data) from the input and the memory WM .

In many models of computation, the control device is a finite automaton. However, in neural Turing machines, the control device is a neural network.18, 19 Often the memory WM consists of cells and connections between them. Each cell can be empty or contain a symbol from the alphabet AM of the machine (automaton) M . However, there are automata in which cells of the memory contain words and even texts in some alphabet. For instance, the symmetric Turing machines studied in the next section have such memory. On each step of computation, the processor PM observes one cell from the memory WM at a time, and can change the symbol in this


cell and go to another cell using connections in the memory WM . These operations are performed according to the instructions RM for the processor PM . These instructions RM can be stored in the processor PM or in the memory WM . However, an instruction machine can also consist of a single processor; a particular case of such a machine is a finite automaton.

Example 2.1. A finite automaton G is an instruction machine, which has the following representation. Namely, a finite automaton (FA) G consists of three structures:

• The linguistic structure L = (Σ, Q, Ω), where Σ is a finite set of input symbols, Q is a finite set of states, and Ω is a finite set of output symbols of the automaton G;
• The state structure S = (Q, q0 , F ), where q0 is an element from Q that is called the start state and F is a subset of Q that is called the set of final (in some cases, accepting) states of the automaton G;
• The action structure, which is traditionally called the transition function (or the transition relation in the case of nondeterministic finite automata) of G and has the following form:

δ : Σ × Q → Q × Ω.

It can also be represented as two relations/functions: the state transition relation/function δ1 : Σ × Q → Q and the output relation/function δ2 : Σ × Q → Ω. Thus, a finite automaton is a triad G = (L, S, δ). The transition relation/function δ is portrayed by descriptions of separate transitions, and each of these descriptions is an instruction (rule) for the automaton functioning. Note that a finite automaton does not have a memory, but it is possible to store information in its states.
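To make the instruction-machine view of Example 2.1 concrete, here is a minimal Python sketch (an illustration of ours, not part of the original formalism): the transition table plays the role of the instruction list, and the example rules, which track the parity of 1s in a binary word, are chosen arbitrarily.

    def run_fa(rules, start, finals, word):
        """Run a Mealy-style finite automaton on an input word.

        rules maps (state, input symbol) -> (next state, output symbol);
        each key-value pair plays the role of one instruction qa -> pb.
        """
        state, outputs = start, []
        for a in word:
            if (state, a) not in rules:      # no applicable instruction (rule)
                return state, outputs, False
            state, b = rules[(state, a)]
            outputs.append(b)
        return state, outputs, state in finals

    # Illustrative instruction list: parity of the number of 1s in a word.
    rules = {
        ("even", "0"): ("even", "e"),
        ("even", "1"): ("odd", "o"),
        ("odd", "0"): ("odd", "o"),
        ("odd", "1"): ("even", "e"),
    }
    print(run_fa(rules, "even", {"even"}, "1101"))
    # -> ('odd', ['o', 'e', 'e', 'o'], False)

Each dictionary entry is one straightforward instruction in the sense discussed below; the automaton itself never modifies this table, which is exactly what distinguishes it from the symmetric automata introduced later.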


Example 2.2. A Turing machine T is also an instruction machine because its functioning is defined by a system of rules (instructions), which, for a Turing machine with one head, have the following form:

qa → pbQ.

Here, a is the symbol that is observed by the head of T and changed to the symbol b in the same cell, q is the state of T , which is changed by this rule to the state p, while Q is the direction of the move of the head after performing the writing operation.

Example 2.3. An inductive Turing machine K is also an instruction machine because its functioning is defined by a system of rules (instructions), which are similar to the rules of Turing machines.9, 20, 21

Example 2.4. A random access machine is an instruction machine.

Remark 2.1. It is possible to consider instruction machines (automata) in which the processors can observe and/or change symbols in a neighborhood of the cell where they are situated. These machines (automata) form a special subclass of structural machines.13, 14

Remark 2.2. Instruction machines (automata) are controlled by algorithms in the form of instructions (rules). Thus, it is possible to introduce and study instruction machines (automata) of the second or higher order, as it is done for conventional algorithms and automata in Refs. [16, 17].

There are scores of categories of instruction machines with various types of instructions. However, it is possible to distinguish three classes of instructions:

• Straightforward or prescriptive instructions directly tell what it is necessary to do.
• Descriptive instructions describe what result it is necessary to obtain.
• Implicit instructions have the form of data that can be interpreted as instructions.


Let us consider some examples.

Example 2.5. The transition rule of a finite automaton G or a Turing machine T is a straightforward instruction.

Example 2.6. A function in functional programming is a descriptive instruction.

Example 2.7. A weight of an artificial neuron in an artificial neural network is an implicit instruction.

Now we can rigorously define symmetric automata and machines.

13.3. Structure of Symmetric Automata and Machines

In conventional abstract automata, only infware is processed (transformed). Symmetric abstract automata have more possibilities.

Definition 3.1. A symmetric abstract machine, or symmetric abstract automaton, H is an abstract automaton (machine) that processes (transforms) both its infware and its software.

Let us specify this definition for instruction machines. To be able to perform these actions, symmetric instruction machines have a more advanced computational architecture than general instruction machines. If instructions are also stored in the memory of the machine, this can be organized in different ways:

1. Instructions can be stored in the same memory as the processed data while being labeled in a specific way.
2. Instructions can be stored in a separate part of the general memory of the machine.
3. It is possible to have different memories for instructions and processed data.

In a similar way, a symmetric automaton (machine) can have:

• one processor that processes (transforms) both data and instructions using different rules,


• two processors, one of which processes (transforms) data while the other processes (transforms) instructions,
• several processors, some of which process (transform) data while the others process (transform) instructions.

Symmetric machines with one processor can be naturally represented as machines with two strongly connected processors. Symmetric machines with many processors form the class of multiprocessor symmetric machines, which is studied elsewhere. That is why here we formalize the second approach by the following definition.

Definition 3.2. A symmetric instruction machine, or symmetric instruction automaton, H has five components:

1. The control device CH , which is a finite automaton and represents the states of the machine (automaton) H.
2. The data memory WH , which stores data.
3. The instruction memory VH , which stores instructions.
4. The data processor PH , which transforms (processes) information (data) from the memory WH .
5. The instruction processor DH , which transforms (processes) information (instructions) from the memory VH .

The memory WH consists of cells and connections between them. Each cell can be empty or contain a symbol from the alphabet AH of the machine (automaton) H. Often the memory WH is an (actually or potentially) infinite tape, that is, a system in which cells are linearly ordered by their connections. The memory VH also consists of cells and connections between them. Each cell can be empty or contain an instruction for the processor PH . Often the memory VH is an (actually or potentially) infinite tape as well.

On each step of computation, the processor PH can observe one cell from the memory WH at a time, change the symbol in this cell and go to another cell using connections in the memory WH . The performed operation is defined by the instruction observed by the processor DH in the memory VH . On each step of computation, the processor DH can observe one cell from the memory VH at a time, change the instruction in this cell and go to another cell using connections in the memory VH .
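The five components of Definition 3.2 can be pictured with the following minimal Python sketch (an illustrative encoding of ours, not part of the original text); the two processors are supplied as ordinary functions that the caller defines.

    from dataclasses import dataclass, field

    @dataclass
    class SymmetricInstructionMachine:
        """Schematic container for the five components of Definition 3.2."""
        state: str                          # current state of the control device C_H
        data_memory: list = field(default_factory=list)         # W_H: cells with data
        instruction_memory: list = field(default_factory=list)  # V_H: cells with rules
        data_head: int = 0                  # cell observed by the data processor P_H
        instruction_head: int = 0           # cell observed by the instruction processor D_H

        def step(self, data_processor, instruction_processor):
            # One step of a symmetric computation: P_H transforms the data
            # memory as directed by the observed instruction, while D_H
            # transforms the instruction memory itself.
            data_processor(self)
            instruction_processor(self)

The point of the sketch is the shape of the architecture: both memories are first-class, mutable components, so a computation can rewrite its own instruction memory in exactly the same way it rewrites its data.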


Let us consider some important specific types of symmetric instruction machines (automata). One of these types consists of the symmetric Turing machines (s-machines) introduced in Refs. [4], [5], and [8]. A symmetric Turing machine (s-machine) is called symmetric because in it the roles of the head and the tape are equivalent. Namely, the action of the head on the tape in a conventional Turing machine is replaced by the interaction of the head and the tape in a symmetric Turing machine. In this symmetric automaton, the head works as the data processor while the tape works as the instruction processor. In our formalization, we do not suppose that a tape works as a processor but simply employ two heads, one of which processes data while the other transforms instructions. Note that it is possible to use one head that performs both kinds of operations depending on its state. However, the model with two heads is more flexible and operational. Besides, the term symmetric Turing machine was also used with a different meaning.7, 23 Namely, a Turing machine is symmetric in the sense of Lewis and Papadimitriou if, with each rule (transition), its system of rules contains the inverse rule (transition), that is, if it has the rule aq → bpD, then it also has the rule bp → aqC, where D and C are opposite moves of the head. To discern these two cases of Turing machines, we suggest calling symmetric Turing machines in the sense of Lewis and Papadimitriou7 instructionally symmetric Turing machines because their sets of instructions are symmetric. At the same time, symmetric Turing machines in the sense of Schroeder,4, 5, 8 as well as the symmetric Turing machines described below in a more formal way, are called operationally symmetric Turing machines because they operate in a symmetric way. It is also necessary to distinguish symmetric machines from time-symmetric machines.23 Thus, we have three classes of symmetric machines (automata): operationally symmetric machines (automata), instructionally symmetric machines (automata), and time-symmetric machines (automata).


In this context, there are three basic types of operationally symmetric Turing machines.

Definition 3.3. An operationally symmetric Turing machine T of the first type has ten components:

• The alphabet ΣT of the data symbols of the machine T .
• The alphabet QT of the state symbols of the machine T , in which the start state q0T and the subset FT of final states are specified.
• The enumerated set ΨT of the possible combined instructions of the machine T , each of which has the form

qh ai rt → aj qk rm X.     (13.1)

This instruction means that the symbol ai is replaced by the symbol aj , the state qh changes to the state qk , and if the list in the instruction tape LT I contains the instruction rt , then it is changed to the instruction rm .
• The control device CT , which is a finite automaton and represents states of the machine T from the set QT .
• The data tape LT D , which consists of cells each of which is either empty or stores one symbol from the alphabet ΣT .
• The instruction tape LT I , which contains the list of instructions from the set ΨT in cells each of which is either empty or stores one instruction; the instructions from LT I determine the functioning of T .
• The data head hT D , which transforms (processes) information (data) from the tape LT D .
• The instruction head hT I , which transforms (processes) information (instructions) from the tape LT I .
• The selection transition function fs : ΣT × QT → LT I .
• The action transition function fa , which is defined by action instructions of the form

qh ai fs (qh ai ) → aj qk rm X.     (13.2)


Here, ai , aj ∈ ΣT ; qh , qk ∈ QT ; rt , rm ∈ ΨT ; and X denotes the move of the data head hT D to the right or to the left in the data tape LT D . The heads perform their operations (transformations and selections) according to the rules from ΨT and the corresponding selection functions fs and fa . Instructions (rules) that are written in the instruction tape LT I are called active. Other instructions (rules) are called passive.

Formally, an operationally symmetric Turing machine T of the first type is represented as T = (ΣT , QT , q0T , FT , ΨT , fs , fa , CT , hT I , hT D , LT D , LT I ).

Having this structure, an operationally symmetric Turing machine T of the first type functions in the following way. At the beginning, the machine T , or more exactly, its control device CT , is in the start state q0 . Input data are stored in the data tape LT D as a word w0 in the alphabet ΣT . The initial list of instructions is stored in the instruction tape LT I as a word in the alphabet ΨT . The data head hT D observes the first symbol a0 of w0 . The instruction head hT I observes the first instruction in the list stored in LT I . On each step, the machine T performs the following operations. If the data head hT D observes a symbol ai and the control device CT is in the state qh , then the selection transition function selects an active instruction, i.e., a rule of the form (13.1) from LT I , the left side of which contains the word qh ai . After this, the selected rule fs (qh ai ) together with the pair qh ai is used as the transition rule of the form (13.2), changing qh ai to qk aj , changing fs (qh ai ) in the instruction tape LT I to rm , and moving the data head hT D to the left if X is equal to L or to the right if X is equal to R. The machine T stops when it comes to a final state or when there are no convenient instructions to continue the computation. When the machine T stops, the datum result of the computation is the word in the data tape LT D and the instruction result of the computation is the list in the instruction tape LT I . In essence, an operationally symmetric Turing machine T of the first type (and, as we will see later, of all other types) performs two computations: a data computation and an instruction computation.
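The following Python sketch simulates this functioning under simplifying assumptions of ours (the encoding and names are illustrative, not the chapter's): the instruction tape is a list of combined rules of the form (13.1), each encoded as a tuple (q, a, r_new, b, p, move), where r_new is the rule that replaces the selected one, and fs picks the first active matching rule.

    def run_type1(data, instructions, state, finals, max_steps=1000):
        """Run an operationally symmetric Turing machine of the first type.

        Each combined rule (13.1) is a tuple (q, a, r_new, b, p, move):
        in state q reading a, write b, go to state p, move the data head,
        and rewrite the selected instruction itself into r_new.
        """
        head = 0
        for _ in range(max_steps):
            if state in finals:
                break
            a = data[head]
            # f_s: pick the first active rule whose left side matches (state, a)
            idx = next((i for i, r in enumerate(instructions)
                        if r is not None and r[0] == state and r[1] == a), None)
            if idx is None:            # no convenient instruction: the machine stops
                break
            _q, _a, r_new, b, p, move = instructions[idx]
            data[head] = b             # data computation: rewrite the symbol
            state = p
            instructions[idx] = r_new  # instruction computation: rewrite the rule
            head = max(0, min(head + (1 if move == "R" else -1), len(data) - 1))
        return state, data, instructions

    # The first rule rewrites itself into a halting rule, so the program
    # changes while the machine computes (all rules are illustrative).
    halt_rule = ("s", "0", None, "0", "done", "R")
    flip_rule = ("s", "1", halt_rule, "0", "s", "R")
    print(run_type1(list("10"), [flip_rule, halt_rule], "s", {"done"}))

In the example run, applying flip_rule replaces it by halt_rule on the instruction tape, so the second step of the computation is controlled by an instruction that did not exist in the initial program.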


Computers mostly perform data computations and only sometimes instruction computations. Program optimization and translation of a program from a high-level programming language into machine code (language) are examples of instruction computations. It is also possible to consider operationally symmetric Turing machines of the first type with several data heads and tapes as well as with several instruction heads and tapes. As in the case of conventional Turing machines, an operationally symmetric Turing machine T of the first type is called deterministic if each of its steps is uniquely determined by the observed symbol ai and the state qh . Otherwise, the operationally symmetric Turing machine T is called nondeterministic. An important relation between automata (machines) is operational equivalence.

Definition 3.4. (a) Two automata (machines) A and B are operationally equivalent if, given the same input, they perform the same operations. (b) Two classes of automata (machines) H and K are operationally equivalent if each automaton in H is operationally equivalent to an automaton in K and vice versa.

For instance, a pushdown automaton that does not use its stack is operationally equivalent to a finite automaton. Analyzing the functioning of an operationally symmetric machine of the first type, we can see that if the rules from ΨT never change instructions, the operationally symmetric machine T of the first type works as a conventional Turing machine. This gives us the following result.

Lemma 3.1. Any conventional Turing machine is operationally equivalent to some operationally symmetric machine of the first type.

Although these machines are structurally different, Lemma 3.1 means that a conventional Turing machine is, in essence, a special case of operationally symmetric machines of the first type. Note that in an operationally symmetric Turing machine (s-machine) M defined in Ref. [8], the transition function does not


depend on and does not change the state of the machine. It depends only on two parameters: data symbols and instructions. These parameters are stored in two memory devices:

• a tape for data, which keeps the processed data;
• a tape for instructions, which keeps the instruction list.

Each tape has its own head (processor). The tape for data has the data head and consists of cells, each of which can contain one symbol from the alphabet of the machine M . The tape for instructions has the instruction head and consists of cells, each of which can contain one instruction from the instruction list of the machine M . It is possible to interpret such transitions as caused by interactions of data with instructions. In this context, the transition function (relation) of the operationally symmetric Turing machine M has the form

qh ai rl → aj rm qk XY.     (13.3)

Here, qh and qk are states of A, ai and aj are symbols from the alphabet of M , rl and rm are instructions from the instruction list of M , while X and Y can be one of the symbols R, L and S. Each such rule determines the functioning of both heads of M . The rule (13.3) means that if the state of the control device A of the machine M is qh , the data head hD observes the symbol ai in the cell where this head is situated, and the instruction head hI observes the instruction rl in the cell where it is situated, then the state of A becomes qk ; the head hD writes the symbol aj in the cell where it is situated and moves right to the next cell of its tape when X = R, moves left to the next cell of its tape when X = L, and stays in the same cell when X = S; while the head hI writes the instruction rm in the cell where it is situated and moves right to the next cell of its tape when Y = R, moves left to the next cell of its tape when Y = L, and stays in the same cell when Y = S. Each rule directs one step of computation of the operationally symmetric Turing machine M . However, it is more efficient to separate the transformation of data and the transformation of instructions, as is done in operationally symmetric Turing machines of the second type.


Definition 3.5. An operationally symmetric Turing machine T of the second type has eleven components:

1. The alphabet ΣT of the data symbols of the machine T .
2. The alphabet QT of the state symbols of the machine T , in which the start state q0T and the subset FT of final states are specified.
3. The set ΨT of the possible state transitions (basic instructions/rules) of the machine T , each of which has the form

qh ai → aj qk X.     (13.4)

4. The set ΓT of the action instructions of the machine T , each of which has the form

qh ai rt → qk rm Y.     (13.5)

5. The control device CT , which is a finite automaton and represents states of the machine T from the set QT .
6. The data tape LT D , which consists of cells each of which is either empty or stores one symbol from the alphabet ΣT .
7. The instruction tape LT I , which consists of cells each of which is either empty or stores one basic instruction from the set ΨT .
8. The data head hT D , which transforms (processes) information (data) from the tape LT D .
9. The instruction head hT I , which transforms (processes) information (instructions) from the tape LT I .
10. The selection transition function fs : ΣT × QT → LT I .
11. The selection transformation function ft : ΣT × QT × ΨT → ΓT .

Here, ai , aj ∈ ΣT ; qh , qk ∈ QT ; rt , rm ∈ ΨT ; Y denotes the move of the instruction head hT I to the right or to the left in the instruction tape LT I , and X denotes the move of the data head hT D to the right or to the left in the data tape LT D . Instructions (rules) that are written in the instruction tape LT I are called active. Other instructions (rules) are called passive.


Formally, an operationally symmetric Turing machine T of the second type is represented as T = (ΣT , QT , q0T , FT , ΨT , ΓT , fs , ft , CT , hT I , hT D , LT D , LT I ).

Having this structure, an operationally symmetric Turing machine T of the second type functions in the following way. At the beginning, the machine T , or more exactly, its control device CT , is in the start state q0 . Input data are stored in the data tape LT D as a word w0 in the alphabet ΣT . The initial list of instructions is stored in the instruction tape LT I as a word in the alphabet ΨT . The data head hT D observes the first symbol a0 of w0 . The instruction head hT I observes the first instruction in the list stored in LT I . On each step, the machine T performs the following operations. If the data head hT D observes a symbol ai , the instruction head hT I observes an instruction rt , and the control device CT is in the state qh , then the selection transition function fs selects the rule of the form (13.4) from LT I , the left side of which is equal to qh ai , while the selection transformation function ft selects the rule of the form (13.5), the left side of which is equal to qh ai rt . After this, the selected rule fs (qh ai ) together with the pair qh ai is used in the relevant rule of the form (13.4), changing qh ai to qk aj and moving the data head hT D to the left if X is equal to L or to the right if X is equal to R. At the same step, the triad qh ai rt is used in the relevant rule of the form (13.5), changing rt to rm and moving the instruction head hT I to the left if Y is equal to L or to the right if Y is equal to R. The machine T stops when it comes to a final state or when there are no convenient instructions to continue the computation. When the machine T stops, the datum result of the computation is the word in the data tape LT D and the instruction result of the computation is the list of instructions in the tape LT I . As in the case of conventional Turing machines, an operationally symmetric Turing machine T of the second type is called deterministic if each of its steps is uniquely determined by the observed symbol ai and the state qh . Otherwise, the operationally symmetric Turing machine T is called nondeterministic.
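The separation of the two rule sets can be sketched in Python as follows (an illustration under assumed encodings of ours: basic rules (13.4) in a dict keyed by (q, a), action rules (13.5) in a dict keyed by (q, a, r)); in this reading, the new state comes from the basic rule, while the action rule only rewrites the observed instruction.

    def type2_step(basic, action, state, data, d_head, itape, i_head):
        """One step of a type-2 machine: basic rules transform the data tape,
        action rules transform the instruction tape."""
        a, r = data[d_head], itape[i_head]
        if (state, a) not in basic:            # f_s finds no active basic rule
            return None
        b, new_state, x = basic[(state, a)]    # rule (13.4): qh ai -> aj qk X
        data[d_head] = b
        if (state, a, r) in action:            # f_t selects an action rule (13.5)
            r_new, y = action[(state, a, r)]
            itape[i_head] = r_new              # the observed instruction changes
            i_head += 1 if y == "R" else -1
        return new_state, d_head + (1 if x == "R" else -1), i_head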


We see that while an operationally symmetric Turing machine T of the first type changes the instruction that is applied at a given step to the data symbol and the state, an operationally symmetric Turing machine P of the second type changes the instruction that is observed by the instruction head hT I in the instruction tape LT I . In a general case, the applied instruction and the observed instruction are different. Thus, operationally symmetric Turing machines of the first and second types work in different ways. However, we can program an operationally symmetric Turing machine P of the second type so that the instruction head always moves to the instruction cell the content of which is applied to data at this step. Consequently, the instruction head always observes the instruction that is applied to data at this step. This feature makes the machine P work exactly as an operationally symmetric Turing machine T of the first type. In such a way, we obtain the following result.

Proposition 3.1. Any operationally symmetric Turing machine of the first type is operationally equivalent to an operationally symmetric machine of the second type.

Note that it is also possible to consider operationally symmetric Turing machines of the second type with several data heads and tapes as well as with several instruction heads and tapes. It is easy to observe that if the rules of T never change instructions, an operationally symmetric machine T of the second type works as a conventional Turing machine. This gives us the following result.

Lemma 3.2. Any conventional Turing machine is operationally equivalent to an operationally symmetric machine of the second type.

In essence, Lemma 3.2 means that a conventional Turing machine is a special case of operationally symmetric machines of the second type. There is also another way to organize the functioning of operationally symmetric Turing machines. This brings us to the third type of operationally symmetric Turing machines.


Definition 3.6. An operationally symmetric Turing machine T of the third type has twelve components:

1. The alphabet ΣT of the data symbols of the machine T .
2. The alphabet QT of the state symbols of the control devices of the machine T , given as the union QDT ∪ QIT , where QDT is the alphabet of the state symbols of the data control device, for which the start state q0DT and the subset FDT of final states are specified, and QIT is the alphabet of the state symbols of the instruction control device, for which the start state q0IT and the subset FIT of final states are specified.
3. The set ΨT of the possible state transitions of the machine T , each of which has the form

qh ai → aj qk X.     (13.6)

4. The set ΓT of the possible instruction transitions of the machine T , each of which has the form

ph rt → pl rm Y.     (13.7)

5. The data control device CDT , which is a finite automaton and represents states of the machine T from the set QDT .
6. The instruction control device CIT , which is a finite automaton and represents states of the machine T from the set QIT .
7. The data tape LT D , which consists of cells each of which is either empty or stores one symbol from the alphabet ΣT .
8. The instruction tape LT I , which consists of cells each of which is either empty or stores one instruction from the set ΨT .
9. The data head hT D , which transforms (processes) information (data) from the tape LT D .
10. The instruction head hT I , which transforms (processes) information (instructions) from the tape LT I .
11. The selection transition function fs : ΣT × QDT → LT I .


12. The selection transformation function ft : QIT × ΨT → ΓT .

Here, ai , aj ∈ ΣT ; qh , qk ∈ QDT ; ph , pl ∈ QIT ; rt , rm ∈ ΨT ; Y denotes the move of the instruction head hT I to the right or to the left in the instruction tape LT I , and X denotes the move of the data head hT D to the right or to the left in the data tape LT D .

Formally, an operationally symmetric Turing machine T of the third type is represented as T = (ΣT , QDT , q0DT , FDT , QIT , q0IT , FIT , ΨT , ΓT , fs , ft , CDT , CIT , hT I , hT D , LT D , LT I ). Instructions (rules) that are written in the instruction tape LT I are called active. Other instructions (rules) are called passive. Note that in a general case, an operationally symmetric Turing machine T of the third type is nondeterministic because the transformations of data and instructions are performed independently.

An operationally symmetric Turing machine T of the third type functions in the following way. At the beginning, the control device CDT is in the start state q0 and the control device CIT is in the start state p0 . Input data are stored in the data tape LT D as a word w0 in the alphabet ΣT . The initial list of instructions is stored in the instruction tape LT I as a word in the alphabet ΨT . The data head hT D observes the first symbol a0 of w0 and the instruction head hT I observes the first instruction in the list stored in LT I . On each step, the machine T performs the following operations. If the data head hT D observes a symbol ai and the control device CDT is in the state qh , then the selection transition function fs selects the rule of the form (13.6) from LT I , the left side of which is equal to qh ai . After this, the selected rule fs (qh ai ) together with the pair qh ai is used in the relevant rule of the form (13.6), changing qh ai to qk aj and moving the data head hT D to the left if X is equal to L or to the right if X is equal to R. At the same time, or more exactly,


independently, the instruction head hT I performs its operations. Namely, if the instruction head hT I observes an instruction rt and the control device CIT is in the state ph , then the selection transformation function ft selects the rule of the form (13.7), the left side of which is equal to ph rt . After this, the selected rule ft (ph rt ) together with the pair ph rt is applied, changing ph rt to pl rm , that is, changing the state of CIT to pl , replacing the instruction rt by rm , and moving the instruction head hT I to the left if Y is equal to L or to the right if Y is equal to R. The machine T stops when it either comes to a final state from FDT or does not have relevant active rules to continue the computation. When the machine T comes to a final state from FIT or does not have relevant rules to change instructions, it continues working with the constructed system of instructions without changing them. When the machine T stops, the datum result of the computation is the word in the data tape LT D and the instruction result of the computation is the list in the instruction tape LT I .

The operationally symmetric Turing machine T of the third type functions in a way similar to an operationally symmetric Turing machine of the second type. However, in comparison with an operationally symmetric Turing machine of the second type, the operationally symmetric Turing machine T of the third type changes data and instructions independently. This makes operationally symmetric Turing machines of the third type more flexible in comparison with operationally symmetric Turing machines of the second type. Consequently, the reflexive Turing machines introduced and studied in Refs. [2, 3] are operationally symmetric Turing machines of the third type.
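The independence of the two transformations can be pictured as two separate step functions (a Python sketch with assumed encodings of ours; firing them in an arbitrary interleaving models the nondeterminism noted in Definition 3.6):

    def data_step(psi, q, data, d_head):
        # psi: rules (13.6) keyed by (q, a), value (b, q_next, X)
        key = (q, data[d_head])
        if key not in psi:
            return None                    # no relevant active rule
        b, q_next, x = psi[key]
        data[d_head] = b
        return q_next, d_head + (1 if x == "R" else -1)

    def instruction_step(gamma, p, itape, i_head):
        # gamma: rules (13.7) keyed by (p, r), value (p_next, r_new, Y)
        key = (p, itape[i_head])
        if key not in gamma:
            return None                    # instructions stay as constructed
        p_next, r_new, y = gamma[key]
        itape[i_head] = r_new
        return p_next, i_head + (1 if y == "R" else -1)

A parallel machine of the third type, introduced below, corresponds to calling both functions on every step; other schedulers give the more general, nondeterministic behavior.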


Remark 3.1. Operationally symmetric Turing machines and operationally symmetric inductive Turing machines are multiprocessor abstract automata, in which one processor transforms data and another one transforms rules.24, 25 More exactly, operationally symmetric Turing machines and operationally symmetric inductive Turing machines of the first and second types are multiprocessor devices, while operationally symmetric Turing machines and operationally symmetric inductive Turing machines of the third type are multicomputer devices.

We can see that if the rules of the machine, which involve working symbols, names of states and instructions, never change instructions, an operationally symmetric machine T of the third type works as a conventional Turing machine. This gives us the following result.

Lemma 3.3. Any conventional Turing machine is operationally equivalent to some operationally symmetric machine of the third type.

In essence, Lemma 3.3 means that a conventional Turing machine is a special case of operationally symmetric Turing machines of the third type. It is natural to consider operationally symmetric Turing machines of the third type in which each step of operation includes one operation with data and one operation with instructions. We call them parallel symmetric Turing machines of the third type. There are close relations between parallel symmetric Turing machines of the third type and operationally symmetric machines of the second type.

Proposition 3.2. (a) Any parallel symmetric Turing machine of the third type is operationally equivalent to an operationally symmetric machine of the second type. (b) Any symmetric Turing machine of the second type is operationally equivalent to an operationally symmetric machine of the third type.

Proof. (a) At first, let us take a parallel symmetric Turing machine T of the third type and construct an operationally symmetric Turing machine M of the second type, which is operationally equivalent to T . By definition, the machine T is represented as T = (ΣT , QDT , q0DT , FDT , QIT , q0IT , FIT , ΨT , ΓT , fs , ft , CDT , CIT , hT I , hT D , LT D , LT I ).


We have to construct M as the following system of components: M = (ΣM , QM , q0M , FM , ΨM , ΓM , fs , ft , CM , hM I , hM D , LM D , LM I ). To have the necessary machine M , we define its components in the following way: ΣM = ΣT , QM = QDT × QIT , q0M = (q0DT , q0IT ), FM = FDT × FIT . The control device CM is defined by its states QM , q0M , FM and the rules that change its states. As a result, it is isomorphic to CDT × CIT because QM = QDT × QIT and FM = FDT × FIT . The situation when physically (symbolically) different physical (symbolic) devices are isomorphic, that is, have the same structure, is denoted by the symbol ≅. Thus, we have CM ≅ CDT × CIT . The heads hM I and hM D and the tapes LM D and LM I are defined by the rules from ΨM , ΓM and the corresponding selection functions. The heads and tapes of the machine T have the same structure as the corresponding heads and tapes of the machine M . In other words, hT I ≅ hM I , hT D ≅ hM D , LT D ≅ LM D and LT I ≅ LM I . Rules in the set ΨT of the machine T of the third type have the form qh ai → aj qk X and rules in the set ΓT of the machine T have the form ph rt → pl rm Y . At the same time, rules in the set ΨM of the machine M of the second type have the form qh ai → aj qk X and rules in the set ΓM of the machine M have the form qh ai rt → qk rm Y . Thus, we can define the rules for the machine M of the second type, which determine its functioning, in the following way:

ΨM = {(qh , p)ai → aj (qk , p)X; p ∈ QIT & qh ai → aj qk X ∈ ΨT },
ΓM = {(q, ph )a rt → (q, pl )rm Y ; q ∈ QDT & a ∈ ΣT & ph rt → pl rm Y ∈ ΓT & ph ∉ FIT }.

By definition, rules from ΨM do not change elements from the component QIT in QM while rules from ΓM do not change elements


from the component QDT in QM . As a result, these rules work independently in the parallel mode, as they are performed in the functioning of the parallel operationally symmetric Turing machine T of the third type. Thus, the machine M is operationally equivalent to the machine T .

(b) Let us take an operationally symmetric Turing machine M of the second type and construct an operationally symmetric Turing machine T of the third type, which is operationally equivalent to M . By definition, the machine M is represented as M = (ΣM , QM , q0M , FM , ΨM , ΓM , fs , ft , CM , hM I , hM D , LM D , LM I ). We have to construct T as the following system of components: T = (ΣT , QDT , q0DT , FDT , QIT , q0IT , FIT , ΨT , ΓT , fs , ft , CDT , CIT , hT I , hT D , LT D , LT I ). To have the necessary machine T , we define the components of T in the following way:

ΣT = ΣM ,
QDT = QIT = QM ,
q0DT = q0IT = q0M ,
FDT = FIT = FM ,
CDT ≅ CIT ≅ CM ,
hT I ≅ hM I , hT D ≅ hM D ,
LT D ≅ LM D , LT I ≅ LM I .

As before, the symbol ≅ means that physically (symbolically) different physical (symbolic) devices are isomorphic, that is, have the same structure. Thus, both control devices CDT and CIT are isomorphic to CM and are defined by the states QM , q0M , FM and the rules


that change these states, that is, CDT ≅ CIT ≅ CM . In the same way, the functioning of the heads hT I and hT D and of the tapes LT D and LT I is defined by the rules from ΨM , ΓM and the corresponding selection functions. The heads and tapes of the machine T have the same structure as the corresponding heads and tapes of the machine M . In other words, hT I ≅ hM I , hT D ≅ hM D , LT D ≅ LM D and LT I ≅ LM I . Rules in the set ΨM of the machine M of the second type have the form qh ai → aj qk X and rules in the set ΓM of the machine M have the form qh ai rt → qk rm Y . At the same time, rules in the set ΨT of the machine T of the third type have the form qh ai → aj qk X and rules in the set ΓT of the machine T have the form ph rt → pl rm Y . Thus, we can define the rules for the machine T of the third type, which determine its functioning, in the following way:

ΨT = ΨM ,
ΓT = {ph rt → pl rm Y if rt = (qh ai → aj qk X) and qh ai rt → qk rm Y ∈ ΓM , for all rt ∈ ΨM }.

The rules ph rt → pl rm Y and qh ai → aj qk X are applied at the same time, that is, in a parallel way. As a result, these rules determine the same operations as the rules qh ai → aj qk X and qh ai rt → qk rm Y of the machine M . Thus, the machine T is operationally equivalent to the machine M . The proposition is proved.

As the operationally symmetric Turing machine T of the third type constructed in the proof of Proposition 3.2(b) is parallel, we have the following result.

Corollary 3.1. The classes of parallel operationally symmetric Turing machines of the third type and of operationally symmetric machines of the second type are operationally equivalent.

Now let us consider symmetric finite automata. The structure of a finite automaton allows two forms of symmetric finite automata: the split symmetric finite automaton (SSFA) and the join symmetric finite automaton (JSFA).


Definition 3.7. A split symmetric finite automaton (SSFA) A consists of three components:

1. The linguistic structure L = (Σ, Q, Ω), where Σ is a finite set of input symbols, Q is a finite set of states, and Ω is a finite set of output symbols of the automaton A.
2. The state structure S = (Q, q0 , F ), where q0 is an element from Q that is called the start state and F is a subset of Q that is called the set of final (in some cases, accepting) states of the automaton A.
3. The action structure, which consists of the following sets:
(a) The list of transitions L = {tr1 , tr2 , . . . , trn }, where each transition has the form qa → pb. Here, a is an input symbol from Σ, b is an output symbol from Ω, while q and p are states from Q.
(b) The list of transition transformations P = {tm1 , tm2 , . . . , tmk }, where each transformation has the form qa [se → pb] → [rc → td]. Here, a, e and c are input symbols from Σ, b and d are output symbols from Ω, while q, s, r, t and p are states from Q.
(c) The selection transition function fdA : Σ × Q → L.
(d) The selection transformation function ftA : Σ × Q → P .

A split symmetric finite automaton A functions in the following way. At the beginning, the automaton A is in the start state q0 . When an input symbol a comes to the automaton A, which is in a state q, the selection transition function fdA chooses the rule with the left side qa from L, for example, qa → pb, and the selection transformation function ftA chooses the rule with the left side containing qa from P ,


for example, qa [se → pb] → [rc → td], and the automaton A performs actions according to these rules, for example, giving the output b and replacing the rule se → pb by the rule rc → td in the list P . A join symmetric finite automaton has a similar structure but functions in a different way.

Definition 3.8. A join symmetric finite automaton (JSFA) A consists of three components:

1. The linguistic structure L = (Σ, Q, Ω), where Σ is a finite set of input symbols, Q is a finite set of states, and Ω is a finite set of output symbols of the automaton A.
2. The state structure S = (Q, q0 , F ), where q0 is an element from Q that is called the start state and F is a subset of Q that is called the set of final (in some cases, accepting) states of the automaton A.
3. The action structure, which consists of the following sets:
(a) The list of transitions L = {tr1 , tr2 , . . . , trn }, where each transition has the form qa → pb. Here, a is an input symbol from Σ, b is an output symbol from Ω, while q and p are states from Q.
(b) The list of transition transformations P = {tm1 , tm2 , . . . , tmk }, where each transformation has the form qa [se → pb] → [rc → td]. Here, a, e and c are input symbols from Σ, b and d are output symbols from Ω, while q, s, r, t and p are states from Q.
(c) The selection transition function fdA : Σ × Q → L.
(d) The selection transformation function ftA : Σ × Q → P .

A join symmetric finite automaton A functions in the following way.


At the beginning, the automaton A is in the start state q0 . When an input symbol a comes to the automaton A, which is in a state q, the selection transition function fdA chooses the rule with the left side qa from L, for example, qa → pb, the selection transformation function ftA chooses the rule with the left side qa [qa → pb] from P , for example, qa [qa → pb] → [rc → td], and the automaton A performs actions according to these rules, e.g., giving the output b and replacing the rule qa → pb by the rule rc → td in the list P .

Symmetric finite automata allow building operationally symmetric cellular automata, in which all elements are symmetric finite automata. Note that the symmetric cellular automata studied in Ref. [26] are, by definition, cellular automata with permutation-invariant local rules acting on symmetric lattices. To discern them from operationally symmetric cellular automata, it would be reasonable to call them structurally symmetric cellular automata.

13.4. Functional Characteristics of Operationally Symmetric Turing Machines

There are various modes of information processing in abstract automata, material computers and networks. For instance, in Ref. [27], eight internal modes of abstract automata, computer and network functioning are described. Computational and networking practice shows that taking into account modes of information processing is important for the efficient design of distributed systems. Here, we consider the basic internal modes of automaton functioning.

• Reactive mode: given some input, an automaton directly gives the corresponding output.
• Recursive mode: built on recursion in processing input and giving the corresponding output.
• Inductive mode: built on induction in processing input and giving the corresponding output.

Examples of automata working in the reactive (subrecursive) mode are deterministic finite automata, logical gates and nondeterministic finite automata. Examples of automata and algorithms working in the recursive mode are Turing machines, Kolmogorov algorithms, Minsky machines, storage modification machines, random access machines (RAM), Petri nets, cellular automata and partial recursive functions. Examples of automata working in the inductive mode are inductive Turing machines,20 evolutionary inductive Turing machines,28 evolutionary finite automata working in the inductive mode,29 inductive cellular automata30 and inductive register machines.31

There are different techniques to organize program formation in symmetric machines. The basic types are as follows:

• Prior program formation (PPF) implies that the program, that is, a set of instructions for the data processor, is prepared before the main computation.
• Concurrent program formation (CPF) implies that the program, that is, a set of instructions for the data processor, is changing at the same time as the main computation goes on.
• Interval program formation (IPF) implies that the program, that is, a set of instructions for the data processor, is not changing during definite intervals in the main computation.

Each of these types can be divided into several subclasses. Namely, there are three sorts of concurrent program formation (CPF):

1. Parallel program formation (PPF) means that the program, i.e., a set of instructions for the data processor, is changing on each step of the main computation, i.e., the data processor and instruction processor are synchronized (illustrated in the sketch following this list).
2. Independent program formation (ITPF) means that the data processor and instruction processor work at the same time; sometimes the machine has to wait until the necessary instruction is formed.
3. Mixed program formation (MPF) means that in the main computation, the data processor and instruction processor can be synchronized only during some time.
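As a concrete illustration of parallel program formation (a toy of my own construction, under the assumption that a "program" is just a finite set of symbol-rewriting rules), the following sketch runs a data processor and an instruction processor in lockstep, one rule-set change per data-processing step.

```python
# A toy illustration of parallel program formation: the instruction
# processor rewrites the program in lockstep with the data processor.
def run(data, program, form_program, steps):
    """data: list of symbols; program: rule set, symbol -> symbol;
       form_program: the instruction processor, called once per step."""
    for _ in range(steps):
        data = [program.get(s, s) for s in data]  # one data-processing step
        program = form_program(program)           # one program-formation step
    return "".join(data)

# The rule set evolves while it is being applied: at each step, every
# right-hand side advances by one letter (a toy instruction processor).
shift = lambda P: {k: chr(ord(v) + 1) for k, v in P.items()}
print(run(list("aaa"), {"a": "b", "b": "c"}, shift, 2))   # "ddd"
```

With prior program formation, the same two steps under the fixed rule set {"a": "b", "b": "c"} would instead produce "ccc"; the difference is exactly the concurrent change of the program.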

There are also three sorts of interval program formation (IPF):

1. Oscillating program formation means that each step of data processing is followed by an instruction-processing step and vice versa.
2. Regular interval program formation means that all intervals of data and instruction processing have the same length.
3. Casual program formation means that there are no restrictions on the interchanging intervals of processing.

Let us consider some examples.

Example 4.1 (See Ref. [2]). Depending on the organization of their functioning, reflexive Turing machines perform either concurrent or interval program formation.

Example 4.2. Computers perform prior program formation when they use compilers.

Example 4.3. Computers perform concurrent program formation when they use interpreters.

Operationally symmetric Turing machines have many properties similar to the properties of Turing machines. For instance, we have the following result.

Theorem 4.1. (a) There is a universal operationally symmetric Turing machine U of the first type. (b) There is a universal operationally symmetric Turing machine V of the second type. (c) There is a universal operationally symmetric Turing machine W of the third type.

In all three cases, the proof is similar to the proof of the existence of a universal Turing machine described, for example, in Ref. [32]. To do this, we first construct a coding by words of all operationally symmetric Turing machines of the corresponding type. After this, we show how a universal operationally symmetric Turing machine of this type can simulate an arbitrary operationally symmetric Turing machine using its code.

As in the case of Turing machines, universal operationally symmetric Turing machines are useful for studying decidability and computability problems for operationally symmetric Turing machines. However, these problems are studied elsewhere.

Theorem 4.2. Functioning of an operationally symmetric Turing machine of the first type can be simulated by a Turing machine with eight tapes and heads.

Proof. Given an operationally symmetric Turing machine T of the first type, we build a Turing machine M, which simulates T. The machine M has five input tapes: the data input tape L^M_ID; the input instruction tape L^M_II, which contains the list of initial basic instructions of the machine T; the total instruction tape L^M_IA, which contains the list of all basic instructions of the machine T; the input action tape L^M_II, which contains the list of action instructions of the machine T; and the input selection tape L^M_IS, which contains the description of the selection transition function of the machine T. The machine M also has three working tapes: the working data tape L^M_WD, which represents the data tape L^T_D of the machine T; the working instruction tape L^M_WI, which represents the instruction tape L^T_I of the machine T; and the support working tape L^M_W, which is used for modeling transitions of the machine T. Functioning of the Turing machine M is similar to functioning of a deterministic Turing machine simulating a nondeterministic Turing machine. It is possible to find a detailed description of this process, for example, in Ref. [32]. □

The same result is true for operationally symmetric Turing machines of the second type.

Theorem 4.3. Functioning of an operationally symmetric Turing machine of the second type can be simulated by a Turing machine.

The proof is similar to the proof of Theorem 4.2. Theorem 4.3 and Proposition 3.1 imply the following result.

Corollary 4.1. Functioning of an operationally symmetric Turing machine of the third type can be simulated by a Turing machine.

In addition to operational equivalence, there are other types of equivalences between classes of automata (machines).

Definition 4.1 (See Ref. [33]). Two classes of automata (machines) H and K are functionally equivalent if they compute the same class of functions.

For instance, Turing machines with one tape and 1D cellular automata are functionally equivalent.9

Lemma 4.1. If the results of computations are defined in the same way for two classes of automata (machines) or algorithms, then their operational equivalence implies their functional equivalence.

Theorem 4.2 and Lemmas 3.1 and 4.1 imply the following result.

Corollary 4.2. Operationally symmetric Turing machines of the first type are functionally equivalent to Turing machines.

Theorem 4.2 and Lemmas 3.2 and 4.1 imply the following result.

Corollary 4.3. Operationally symmetric Turing machines of the second type are functionally equivalent to Turing machines.

Theorem 4.2 and Lemmas 4.1 and 3.3 imply the following result.

Corollary 4.4. Parallel operationally symmetric Turing machines of the third type are functionally equivalent to Turing machines.

As equivalence is a symmetric relation, Corollaries 4.2–4.4 imply the following result.

Corollary 4.5. Operationally symmetric Turing machines of the first and second types and parallel operationally symmetric Turing machines of the third type are functionally equivalent.

There is one more important type of automaton (machine) equivalence.

Definition 4.2 (See Ref. [33]). Two classes of automata H and K are linguistically equivalent if they compute (or accept) the same class of languages.

For instance, deterministic accepting finite automata and nondeterministic accepting finite automata are linguistically equivalent.32

Lemma 4.2 (See Ref. [33]). Functional equivalence implies linguistic equivalence.

Lemma 4.2 and Corollaries 4.2–4.4 imply the following results.

Corollary 4.6. Operationally symmetric Turing machines of the first type are linguistically equivalent to Turing machines.

Corollary 4.7. Operationally symmetric Turing machines of the second type are linguistically equivalent to Turing machines.

Corollary 4.8. Parallel operationally symmetric Turing machines of the third type are linguistically equivalent to Turing machines.

Corollary 4.9. Operationally symmetric Turing machines of the first and second types and parallel operationally symmetric Turing machines of the third type are linguistically equivalent.

There is a natural relation between parallel operationally symmetric Turing machines of the third type and Turing machines.

Theorem 4.4. Functioning of a parallel operationally symmetric Turing machine of the third type can be simulated by a Turing machine.

To prove this theorem, we use Proposition 3.2, according to which any parallel symmetric Turing machine of the third type is operationally equivalent to an operationally symmetric machine of the second type. By Theorem 4.3, functioning of an operationally symmetric Turing machine of the second type can be simulated by a Turing machine. Consequently, functioning of a parallel operationally symmetric Turing machine of the third type can be simulated by a Turing machine.

Remark 4.1. Theorem 4.4 means that parallel symmetric Turing machines of the third type are recursive algorithms.

However, in the general case, when an operationally symmetric Turing machine processes data and instructions in the concurrent mode, this machine can become superrecursive and thus cannot be simulated by a Turing machine.

In what follows, let us assume that all Turing machines and symmetric Turing machines considered here work with data alphabets in which each symbol is processed (for example, written or erased) in one step of the machine's functioning.

Theorem 4.4 shows that parallel symmetric Turing machines of the third type have the same computing power as Turing machines. At the same time, it is also possible to prove that parallel symmetric Turing machines of the third type can be much more efficient than Turing machines. Reflexive Turing machines are parallel operationally symmetric Turing machines of the third type. This allows us to prove the following theorem using results from Refs. [2, 3].

Theorem 4.5. For any natural number k, any universal Turing machine U and any program p for U with time complexity TU,p(n) ≥ n, there is a parallel symmetric Turing machine M of the third type that can perform the same computations with time complexity TM,p(n) which is asymptotically less than (1/2)^k TU,p(n).

Allowing utilization of memory and processors that work with infinite alphabets, each symbol of which is processed (for example, written or erased) in one step of the machine's functioning, it is possible to prove an even stronger result.

Theorem 4.6. For any universal Turing machine A with time complexity TA(n) ≥ n, there is a parallel symmetric Turing machine B of the third type that can perform the same computations with time complexity TB(n) which is asymptotically less than (1/2)^k TA(n) for any natural number k.

Note that here we use only potentially infinite alphabets and not actually infinite ones, such as, for example, the set of all natural numbers or the set of all real numbers.

These results show that although symmetric Turing machines of the third type have the same computing power as Turing machines, they can be essentially more efficient.

Further considerations of symmetric automata show that there are intrinsic relations between operationally symmetric Turing machines and evolutionary Turing machines. We recall the definition of an evolutionary Turing machine;28,29 a toy rendering of its generation loop is sketched after the definition. An evolutionary Turing machine (ETM) E = {TM[t]; t = 0, 1, 2, 3, ...} is a series of (possibly infinitely many) Turing machines TM[t], each working on a population X[t] in generations t = 0, 1, 2, 3, ..., where

• each transition function (set of rules) δ[t] of the Turing machine TM[t] represents (encodes) an evolutionary algorithm that works with the population X[t], which evolved in generations 0, 1, 2, ..., t;
• only the generation X[0] is given in advance, and any other generation depends on its predecessor only, i.e., the outcome of generation t = 0, 1, 2, 3, ... is the population X[t + 1] obtained by applying the recursive variation operator v and selection operator s to the population X[t];
• the operators of recursive variation v and selection s are realized by the transition function δ[t] of each Turing machine TM[t], t = 0, 1, 2, 3, ...;
• TM[0] is the initial Turing machine operating on the initial population X[0] given as its input;
• the goal (or halting) state of the ETM E is represented by any population X[t] satisfying the termination condition; the desirable termination condition is the optimum of the fitness performance measure f(x[t]) of the best individual from the population X[t];
• when the termination condition is satisfied, the ETM E halts (t stops being incremented); otherwise, a new input population X[t + 1] is generated by TM[t].
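The following is a minimal in-silico rendering of this generation loop (the toy population and the concrete variation and selection operators are my own choices, not from Refs. [28, 29]); it shows how, in the finite case, the series TM[t] reduces to iterating variation v and selection s until the termination condition on fitness is met.

```python
# A toy evolutionary loop mirroring the ETM definition above; the
# instance (bit-strings, mutation, truncation selection) is illustrative.
import random

def evolve(X, v, s, f, target, max_gen=1000):
    """X: population X[0]; v: recursive variation; s: selection;
       f: fitness measure; halt when the best individual reaches target."""
    for t in range(max_gen):
        best = max(X, key=f)
        if f(best) >= target:            # termination condition on fitness
            return best, t
        X = s([v(x) for x in X] + X, f)  # X[t+1] from variation + selection
    return max(X, key=f), max_gen

f = lambda x: sum(x)                                      # fitness: number of ones
v = lambda x: [b ^ (random.random() < 0.1) for b in x]    # bit-flip mutation
s = lambda pop, f: sorted(pop, key=f, reverse=True)[:20]  # keep the best 20
best, gens = evolve([[0] * 16 for _ in range(20)], v, s, f, target=16)
print(gens, best)
```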

Observing the functioning of operationally symmetric Turing machines and comparing it with the functioning of evolutionary Turing machines, we see that each time an instruction (rule) in the tape of instructions (rules) changes, it is possible to treat this transformation as a transition to a new conventional Turing machine. This gives us the following result.

Theorem 4.7. Functioning of an operationally symmetric Turing machine of the first and second types can be simulated by an evolutionary Turing machine with the same complexity.

Note that when a Turing machine simulates an operationally symmetric Turing machine, it takes many steps of this Turing machine to transform instructions. In comparison with this, in a simulating evolutionary Turing machine, this change is performed in one step.

13.5. Conclusion

We introduced concepts of operationally symmetric automata, instruction machines and operationally symmetric instruction machines. A complete formalization of operationally symmetric Turing machines was elaborated, resulting in three distinct types of these machines. Relations between these types of operationally symmetric Turing machines and Turing machines were established, and different properties of operationally symmetric Turing machines were obtained. All this opens new directions for further research.

Here, we only defined two classes of symmetric finite automata. It would be interesting to study their properties and to find whether there are other classes of symmetric finite automata in addition to the classes described here.

Inductive Turing machines form a powerful class of algorithms.20 Thus, formalization and exploration of operationally symmetric inductive Turing machines is a motivating problem for future research. It is possible to study the same problem for periodic Turing machines,34 which are intrinsically related to inductive Turing machines.

Structural machines provide an extremely efficient model of computation.13,14 It would be interesting to introduce and study symmetric structural machines.

We proved that operationally symmetric Turing machines of the second and third types can essentially decrease time complexity. It is an appealing problem to find out whether these machines can also decrease Kolmogorov (algorithmic) complexity. Knowing that operationally symmetric Turing machines of the second and third types can essentially outperform Turing machines, it would be attractive to find out whether the same is true for operationally symmetric Turing machines of the first type. It would also be interesting to study properties of symmetric evolutionary Turing machines and operationally symmetric cellular automata.

References

1. S. Kleene, Constructive and non-constructive operations. In Proceedings of the International Congress of Mathematicians, Edinburgh, 1958 (Cambridge University Press, 1960).
2. M. Burgin, Reflexive calculi and logic of expert systems. In Creative Processes Modeling by Means of Knowledge Bases (Sofia, 1992), pp. 139–160 (in Russian).
3. M. Burgin, Reflexive Turing machines and calculi. Vychislitelnyye Systemy (Logical Methods in Computer Science) 148, 94–116, 175–176 (1993).
4. M. J. Schroeder, Dualism of selective and structural manifestations of information. In Modelling of Information Dynamics, Computing Nature, SAPERE 7 (Springer, Berlin, Germany, 2013), pp. 125–137.
5. M. J. Schroeder, From proactive to interactive theory of computation. In The 6th AISB Symposium on Computing and Philosophy: The Scandal of Computation — What is Computation? (2013), pp. 47–51.
6. M. Burgin, Information processing by symmetric inductive Turing machines. Proceedings 48, 28 (2020). doi: 10.3390/proceedings47010028
7. H. R. Lewis and C. H. Papadimitriou, Symmetric space-bounded computation. Theoret. Comput. Sci. 19, 161–187 (1982).
8. M. J. Schroeder, Computing with nature. Proceedings 1(3) (2017).
9. M. Burgin, Superrecursive Algorithms (Springer-Verlag, New York, 2005).
10. A. N. Kolmogorov, On the concept of algorithm. Uspekhi Mat. Nauk 8(4), 175–176 (1953).
11. A. Schönhage, Storage modification machines. SIAM J. Comput. 9, 490–508 (1980).
12. J. Kleinberg and E. Tardos, Algorithm Design (Pearson, Addison-Wesley, 2006).
13. M. Burgin and A. Adamatzky, Structural machines and slime mold computation. Int. J. Gen. Syst. 45(3), 201–224 (2017).

14. M. Burgin and A. Adamatzky, Structural machines as a mathematical model of biological and chemical computers. Theory Appl. Math. Comput. Sci. 7(2), 1–30 (2017a).
15. S. Haykin, Neural Networks: A Comprehensive Foundation (Macmillan, New York, 1994).
16. M. Burgin and N. Debnath, Reusability as design of second-level algorithms. In Proceedings of the ISCA 25th International Conference on Computers and their Applications (CATA-2010), ISCA, Honolulu, Hawaii (2010), pp. 147–152.
17. M. Burgin and B. Gupta, Second-level algorithms, superrecursivity, and recovery problem in distributed systems. Theory Comput. Syst. 50(4), 694–705 (2012).
18. A. Graves, G. Wayne, and I. Danihelka, Neural Turing Machines (2014), arXiv:1410.5401.
19. M. Collier and J. Beel, Implementing neural Turing machines. Artif. Neural Netw. Mach. Learn. — ICANN 2018, pp. 94–104 (2018).
20. M. Burgin, Nonlinear phenomena in spaces of algorithms. Int. J. Comput. Math. 80(12), 1449–1476 (2003).
21. M. Burgin, Inductive Turing machines. In A. Adamatzky (ed.), Unconventional Computing: A Volume in the Encyclopedia of Complexity and Systems Science (Springer, Berlin/Heidelberg, 2018), pp. 675–688.
22. O. Reingold, Undirected connectivity in log-space. J. ACM 55(4), 1–24 (2008).
23. M. Kutrib and T. Worsch, Time-symmetric machines. In International Conference on Reversible Computation (RC 2013), LNCS, vol. 7948 (2014), pp. 168–181.
24. R. M. Fujimoto and D. A. Reed, Multicomputer Networks: Message-based Parallel Processing (MIT Press, Cambridge, MA, 1988).
25. S. G. Shiva, Advanced Computer Architectures (CRC Press, 2005).
26. V. V. Kornyak, Symmetric cellular automata. Programm. Comput. Software 33(2), 87–93 (2007).
27. M. Burgin, Super-recursive algorithms and modes of computation. In Proceedings of the 2015 European Conference on Software Architecture Workshops, ACM, Dubrovnik/Cavtat, Croatia, September 7–11, 2015 (2015), pp. 10:1–10:5.
28. M. Burgin and E. Eberbach, On foundations of evolutionary computation: An evolutionary automata approach. In Hongwei Mo (ed.), Handbook of Research on Artificial Immune Systems and Natural Computing: Applying Complex Adaptive Technologies, Section II: Natural Computing, Section II.1: Evolutionary Computing, Chapter XVI (Medical Information Science Reference/IGI Global, Hershey, Pennsylvania, 2009), pp. 342–260.
29. M. Burgin and E. Eberbach, Evolutionary automata: Expressiveness and convergence of evolutionary computation. Comput. J. 55(9), 1023–1029 (2012).

30. M. Burgin, Inductive cellular automata. Int. J. Data Struct. Algorith. 1(1), 1–9 (2015a).
31. M. Burgin, C. S. Calude, and E. Calude, Inductive complexity measures for mathematical problems. Int. J. Found. Comput. Sci. 24(4), 487–500 (2013).
32. J. E. Hopcroft, R. Motwani, and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation (Addison Wesley, Boston/San Francisco/New York, 2007).
33. M. Burgin, Measuring Power of Algorithms, Computer Programs, and Information Automata (Nova Science Publishers, New York, 2010).
34. M. Burgin, Periodic Turing machines. J. Comput. Technol. Appl. (JoCTA) 5(3), 6–18 (2014).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0014

Chapter 14

Computation By Biological Means

Alexander Hasson∗,‡ and Dan V. Nicolau†,§

∗School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
†Mathematical Institute, University of Oxford, Oxford OX1 2JD, UK
‡[email protected]
§[email protected]

Biological computers offer reductions in the consumption of energy, physical and spatial resources, as well as compute time. There is evidence that some problems, such as those in the hardest complexity classes, can be solved efficiently by the unique computational mechanisms involved in computation by biological means. In this chapter we present a systematic evaluation of the best-understood biological computing methods, featuring a critique of their usefulness for the various problems and sticking points faced by modern computing.

14.1. Introduction

14.1.1. Computation

What is a computer? Illustratively, in its original definition, it was nothing other than a person performing calculations manually.a Whether human, machine or — as we here discuss — part-biological and part-synthetic, the term is perhaps best defined functionally: the use of computers is to perform computation, that is, "the action of mathematical calculation".b

a Computer | Definition of Computer by Oxford Dictionary on Lexico.com; also meaning of Computer.

Anything capable of performing mathematical calculations is then said to be a computer. Since "computer" in modern usage, however, has become almost synonymous with a particular type of silicon-based electronic device, for clarity we will here refer to any biological system able to perform "mathematical calculations" as a biological computer (BC).

14.1.2. Use of biology for computation

Computation naturally occurs in the biological (and, it may be argued, purely physical) world. As described, for instance, by de Castro,1 natural computing encompasses three sub-fields: computing methods that are inherently inspired by nature, the use of computers to model natural phenomena, and computation methods that rely on the use of natural materials, such as DNA computing, networks-based biocomputation and others. The latter effort — biological computation — which is our focus here, is related to nature-inspired computation but is distinguished from it by the direct use of biological "hardware". In other words, to qualify as biological computation, the use of biological hardware is a sine qua non.

14.1.3. What can biocomputers do?

Why use biological hardware and software to carry out computations? In short, because — short of the infinitesimally unlikely scenario that computational intractability does not exist — solving "hard" problems fundamental to all human endeavor, including NP-complete problems, will not be possible using serial devices such as electronic computers or choreographed parallelism as in quantum computing, but will genuinely require ground-up, massive parallelism.2 By exploiting the intrinsic properties of living systems, in particular their enormous parallelism, it may be that the limits associated with digital computers, for example, time, memory space and power, will be overcome2 or at least ameliorated.

b Computation | Definition of Computation by Merriam-Webster.

From this point of view, we see that, should they "scale", biocomputers will not replace conventional computers, which are very successful for the tasks they are able to deal with efficiently.3 Rather, BCs may be able to effectively solve problems out of reach of electronic devices, for example finding the Hamiltonian path of a graph.4 Early work on BCs has offered particular hope for finding solutions to such "hard" graph-based problems, with abilities to rapidly explore solutions.5,6

In what follows, we give a brief overview of some themes in biocomputation as defined above. This is intended simply to give the reader a flavor of the field and is far from exhaustive. Biocomputation being a rapidly developing field, we urge the reader's indulgence. We point to several books providing in-depth further reading at the end of the chapter.

14.2. DNA Computing

14.2.1. Origins of DNA computing

The earliest recognized7 form of computation by biological means is DNA computing, described in Adleman's seminal paper.6 DNA computing exploded onto the computational stage, and the field has seen interest in many related mechanisms being explored over the past 25 years. This has led to subfields, such as the creation of splice-based and sticker-based models of computation with DNA.8,9

14.2.2. Computing with DNA

Deoxyribonucleic acid (DNA) is the genetic material required for all known organisms. The complex machinery that allows the growth, development and function of organisms relies on more than just DNA. Nucleobases, the building blocks of complex DNA structures, are supported by other biological "objects", such as DNA ligase and DNA polymerase, enzymes that bind and replicate DNA, as well as nucleases, which sever specific target sequences of DNA.

Laboratory methods to manipulate DNA have existed for over 100 years, but the ability to create and manipulate recombinant DNA4 opened up vast new possibilities.

This includes computations, as described below. Importantly, because it operates near the relevant thermodynamic limits inside living cells, DNA offers incredibly resource-effective computation, with estimates that one litre of DNA can perform all the computation ever done with conventional computers to date.2

14.2.3. Adleman's experiment

Adleman's 1994 paper showed that DNA-based computers can solve combinatorial problems and can do so with the same theoretical computational reach as Turing machines.6 This early demonstration focussed on converting to DNA "format", and then solving in a massively parallel fashion, the famous (and NP-complete) Directed Hamilton Path Problem (DHP), for which no practical algorithm exists on conventional computers (conditional on P ≠ NP). Many similar NP-complete problems, such as the travelling salesman problem (TSP), can be elegantly converted to a DHP instance. Essentially, this result implied much more: namely, that virtually all combinatorial problems could be solved in this way, a belief which would largely go on to be validated in the years after.

Studies on scalability followed. For example, Wang et al. showed how DNA computing (specifically, based on ligase chain reactions) was space efficient and error tolerant, unlike conventional brute-force methods,10 thus suggesting one could scale DNA computing to solve larger and more difficult "hard" problems, and even hinting that DNA computing would overtake conventional computing for these problems in short order.

The DHP can be thought of as modelling flights to various cities (one stop maximum for each city), with a known start and end city.6 Adleman was able to encode the "flight numbers" of his mock DHP with nucleobases, and to encode the paths and directions through utilization of the Watson–Crick complementary pairs. Ligases, polymerase and nucleases were manipulated to rapidly bind, replicate and locate candidate solutions to the DHP, with gel electrophoresis (GE)11 used to find the possible solutions. The polymerase chain reaction (PCR) was used to locate DNA sequences with correct start and end sequences (start and end cities in the DHP). Then, GE was implemented to find the correct solution based on sequence length (equal, of course, to the number of cities). This method has been replicated and improved on by others and is easily reproduced.

Adleman's DHP demonstration was only a prototype but provided definitive proof that DNA computers can work and can be powerful. DNA computing gives hope to biological computing for several reasons. Computing in this way is natural, in the sense that Turing completeness requires only a method to store information and a set of operations to perform computation on that data.12 Therefore, the prospect of computation by biological means, or "biosupremacy" (see below), is promising. In relation to classical devices, DNA is an incredibly advanced form of data storage. Each double strand of DNA has six possible reading frames, with all translations capable of providing different information. Furthermore, biological enzymes such as ligases already provide us with a set of operations for interacting with this data. Therefore, not only can DNA be used to compute, but Turing-complete DNA computers are possible.

Finally, though the appeal of DNA computing is that it can solve problems "rapidly", as mentioned above it is also very efficient in terms of energy. The second law of thermodynamics limits computation to 34 × 10^19 operations per Joule (STP), and Adleman estimated that, in principle, 2 × 10^19 operations per Joule would be possible with his method. This efficiency makes a mockery of the top Green500 supercomputer (2019),c performing at 15.1 × 10^9, the same magnitude as standard supercomputer energy performance in 1994.

c Green500 | TOP500 Supercomputer Sites.
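To convey the logic of this generate-and-filter protocol, here is an in-silico caricature (the graph and the sampling are invented for illustration; in the laboratory the filtering steps are realized by PCR, gel electrophoresis and sequence-specific separation, not by Python): random chaining stands in for ligation of edge strands, and successive filters keep only valid Hamiltonian paths.

```python
# A toy in-silico analogue (not the wet-lab protocol) of Adleman's
# massively parallel generate-and-filter solution of the DHP.
import random

edges = {("s", "a"), ("a", "b"), ("b", "t"), ("s", "b"), ("a", "t")}
cities = {"s", "a", "b", "t"}

def random_path(max_len):                 # "ligation": random edge chaining
    path = ["s"]
    while len(path) < max_len:
        nxt = [v for (u, v) in edges if u == path[-1]]
        if not nxt:
            break
        path.append(random.choice(nxt))
    return path

candidates = [random_path(len(cities)) for _ in range(10000)]
# PCR-like filter: correct start and end "cities".
step1 = [p for p in candidates if p[0] == "s" and p[-1] == "t"]
# Gel-electrophoresis-like filter: correct sequence length.
step2 = [p for p in step1 if len(p) == len(cities)]
# Final filter: every city visited exactly once.
solutions = {tuple(p) for p in step2 if set(p) == cities}
print(solutions)                          # {('s', 'a', 'b', 't')}
```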

14.2.4. Sticker systems

DNA computing has expanded to the development of specialized methods to utilize DNA for computation. One such alternative means is sticker systems/models. Originally proposed by Kari et al., at least one variant of these systems is proven to be Turing complete.13 Sticker systems provide the benefit of not requiring the enzymes utilized in Adleman's original experiment, and thus do not have end-to-end sequence extension (due to the absence of ligases). The sticker system relies on extending sequences by partial matches of Watson–Crick complementary pairs. The system is analogous to a game of dominoes, where only one half (or in this case, one subsequence) needs to align for the structure to elongate. Kari et al. were particularly interested in the generative power of this system for use in formal language theory, though recent interest is based more on the finite-automaton capabilities of sticker systems. Sticker systems have also been shown to have the computational capabilities of splicing systems (the other primary DNA computing model).9

As a computational model, there are several benefits to the sticker system. Firstly, k-bit systems can be "programmed", allowing an inherited benefit for some problems.8 Sticker systems also present a random access memory (RAM) scheme known as "memory strands". These systems also yield operations such as combining bit-strings, clearing and setting bits, and separating bit-strings. As with Adleman's original model, this system was shown to solve combinatorial problems. Though the actual computation time is swift, as pointed out by Xu,9 the setup time and extra work make the model costly in terms of overall operation speed.

Sticker systems, resembling developments in early computer science, demonstrate that abstracting away from the basic computational model is possible. As we are able to look retrospectively at conventional computing, researchers already possess many of the tools and models necessary to create high-level, readily available DNA computers.14,15
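The memory-strand idea can be made concrete with a small sketch (the encoding of strands as bit tuples and the operation names are my own; they follow the operations listed above rather than any specific laboratory protocol): a region with an annealed sticker reads as 1, a bare region as 0, and a tube of strands is split by whether a given region is covered.

```python
# A toy encoding of sticker-system memory strands as bit tuples.
def set_bit(strand, i):                  # anneal a sticker at region i
    return strand[:i] + (1,) + strand[i + 1:]

def clear_bit(strand, i):                # strip the sticker at region i
    return strand[:i] + (0,) + strand[i + 1:]

def separate(tube, i):
    """Split a tube (set of strands) on region i: covered vs. bare."""
    on = {s for s in tube if s[i] == 1}
    return on, tube - on

# Three-bit strands; branch on bit 2, then set bit 0 on one branch only.
tube = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
on, off = separate(tube, 2)
tube = {set_bit(s, 0) for s in on} | off
print(sorted(tube))   # [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
```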

14.2.5. Recent progress on the computational power of DNA

Currin et al. were able to implement a DNA-based non-deterministic universal Turing machine (UTM).16 Unlike a deterministic UTM, this type of theoretical computing device can explore many, if not all, solutions simultaneously. As outlined earlier, DNA is capable of replicating and exploring many paths simultaneously, as in the case of flight paths. This biological power was here leveraged to show in vitro computational superiority over both classical and quantum machines. Benefits of the non-deterministic UTM included error-correction.

14.2.6. Limitations of DNA

Given the theoretical benefits and excitement generated by DNA computing, as well as the plausibility of programmable systems in the near future, it may seem surprising that more progress has not been made recently, outside of specific experimental results. On the other hand, all forms of computation suffer from some physical limit. Clock speeds in conventional processors are limited by temperature, transistors are limited by our abilities to make them smaller, and so on. DNA computing is, as it turns out, severely limited by the amount of biological material required to compute. Conventional electronic computers only have one finite resource as an input: electricity. Depending on which model is implemented, DNA computing has one (sticker systems) to many (Adleman) system requirements. Consider the one-input case. DNA synthesis is now a common procedure, but the cost of a reasonable supply of DNA to any device, computational or otherwise, rapidly outweighs the ease of supply of electricity to modern computers.

There are also technical issues, generally due to the need, when computing the solutions to hard problems, to handle "exponentiality" with DNA in the presence of finite error rates, for example at readout. For "exponentially" big solution spaces, for example satisfiability (SAT)17 problems, exponentially large amounts of DNA would need to be synthesized; and since DNA is much harder to produce than, say, electricity, there would be no way to practically "load" the computer, even if the problems would be solved faster once loaded. In particular, the central issue is how erroneous the calculations can be without affecting the final answer.18 Bit-parity is a solved issue in electronic systems, but no known DNA computer offers such a solution to date. There is, at present, a need for PCR and electrophoresis retrieval4 due to trillions of false solutions being present in the DNA solution.19

Therefore, it turns out to be hard to verify that the solution is in fact correct. However, DNA sequencing and other interests in DNA technology have led to improvements for specific forms of DNA computing.20 Overall, although technological progress in DNA computing has greatly slowed in the last decade, the field has contributed greatly to our understanding of the power of unconventional, massively parallel computing. Quite conceivably, improvements in readout or error-correcting technologies could bring about a thaw in the "DNA computing winter".

14.3. Computation with Slime Mould

14.3.1. Introduction to slime moulds

Slime mould (and related) computing is a young, diverse and, to some extent, fragmented field. Slime moulds (often spelled mold) are a group of eukaryotes with distinct phylogenetic differences, though all are capable of single-cell life-cycles and of forming multicellular structures.d Their unique "motility" characteristics elegantly allow the solution of certain spatial search problems. Slime moulds are highly adaptive to their environment, suggesting their use as excellent agents of optimization. Due to this, they are sometimes seen as a prototype of what will become "true" amorphous biological computers.3,21–24,d

14.3.2. Slime moulds for computational tasks

Often having complex lifecycles, slime moulds such as P. polycephalum have been the subject of diverse efforts at solving toy computational problems.d The appeal of slime mould for this type of task comes partially through its lifecycle, as in the case of P. polycephalum during the amassment of plasmodium.25–28 As summarized in Sun,28 during this stage the slime mould exhibits a number of characteristics that are useful for performing computation. These behaviors are biologically encoded and have been refined through evolution.

d Biology of the Physarum polycephalum Plasmodium: Preliminaries.

They include: an ability to find the shortest path,23 the construction of efficient networks,27 constant adaptation to environmental stimuli,26,28 demonstration of a memory-like ability24 and a distributed/parallel approach to "finding solutions". These properties and others are encouraging,28 suggesting that much of the "hardware" would already have been worked out in computing systems based on these creatures.

14.3.3. Amoeboid organisms as a computer

Andrew Adamatzky has shown that species such as P. polycephalum are capable of producing approximate solutions to spatial problems such as those involving networks.21 Through the work of Adamatzky and others, the innate ability of P. polycephalum to solve shortest-path problems has been shown. An algorithm to efficiently solve these problems must then reside within the biological workings of the organism, hopefully in a form that could be exploited for real-world implementation.

14.3.4. Slime moulds as a form of computation

Adamatzky led a team of researchers (2013) in an effort to achieve heterogeneous computation with slime mould.22 The aptly named "PhyChip" project provided the fundamental research that would be required for real on-board computing techniques. Interfaces between electrical (digital) systems and the biological material were created. Technologies such as logical gates and wires were created with the slime mould, leading on to parallel computing techniques and even the creation of oscillators25,29,30 and logical gates.31,d It would be reasonable to presume that the level of abstraction achievable with slime mould is going to continue to grow, eventually leading to real biological-digital computers.

14.3.5. Select applications of slime moulds

Adamatzky and colleagues showed in Ref. [32] that bio-inspired (and, of course, biologically computed) networks can quickly validate the design of current motorways in populated areas, as well as offering better solutions for minimizing traffic and highway sizes. In simulation, the slime mould biological networks were also shown to be more resistant to both natural and artificial disasters than the real motorways in East London, suggesting real-life, immediate applications for this approach.

Biological algorithms, such as those created from slime mould, are also shown to be simple and powerful by Nakagaki et al.23 There are many situations where simulated slime mould computing would be appropriate, and for this reason several computationally efficient algorithms have been devised. By simulating the slime mould's behavior, the solutions the slime mould would compute can be approximated in an indirect way — potentially offering speedups over digital computing currently. In a similar vein, several models for the flow-network adaptation of amoebae exist.33

e 2P-320 Removal of a rotating wave in true slime mould (The 46th Annual Meeting of the Biophysical Society of Japan).
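One widely used model of this kind is current-reinforcement on a flow network, in which tube conductivities grow where flux flows and decay elsewhere. The sketch below (graph, parameters and discretization are my own illustrative choices, in the spirit of the models cited above) converges so that conductivity survives only on the shortest source-to-sink route.

```python
# A minimal Physarum-style current-reinforcement solver: dD/dt = |Q| - D.
import numpy as np

nodes = 4                                   # 0 = source, 3 = sink
edges = [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 1.0), (2, 3, 3.0)]  # (i, j, length)
D = np.ones(len(edges))                     # initial tube conductivities

for _ in range(100):
    Lap = np.zeros((nodes, nodes))          # weighted graph Laplacian, w = D/L
    for k, (i, j, length) in enumerate(edges):
        w = D[k] / length
        Lap[i, i] += w; Lap[j, j] += w
        Lap[i, j] -= w; Lap[j, i] -= w
    b = np.zeros(nodes); b[0] = 1.0         # unit flow injected at the source
    p = np.zeros(nodes)                     # sink grounded: p[3] = 0
    p[:-1] = np.linalg.solve(Lap[:-1, :-1], b[:-1])
    for k, (i, j, length) in enumerate(edges):
        Q = D[k] / length * (p[i] - p[j])   # flux through tube (i, j)
        D[k] = max(D[k] + 0.5 * (abs(Q) - D[k]), 1e-6)  # reinforce or decay

print(np.round(D, 3))  # conductivity survives on the short route 0-1-3
```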

14.3.6. Challenges of computing with slime moulds

Biological processes are driven by stochasticity. Precision science is unlikely to be performed with slime moulds due to the inherent randomness of their behavior,e as is, in fact, the case with all bio-systems. Through ensemble modeling and statistical analysis of experimental results, mean-field, stable solutions can be observed. Although species such as P. polycephalum converge to the optimal solution, the amount of time taken to reach a steady state can vary wildly, depending on the scenario. Unlike biological computers that rely on discrete interacting elements, slime mould computing is relatively slow.34 Finally, the amount of physical space taken up by these systems as they grow may ultimately prove prohibitive to scaling, though this has not yet been investigated.

14.4. Computation with Motile Biological Agents

Rather than using the foundational information components of living organisms (DNA) or using large single-celled amoebae such as slime moulds, problems can be solved by performing agent-based calculation directly, using the directed/organized movement of biological "agents". These agents are often self-propelled motile biological elements, such as actin filaments moving on myosin-decorated carpets, or bacteria. In all cases, these systems are capable of propulsion through molecular motors — be they flagella5 or other. Alongside DNA computing, mobile biomolecules were the only forms of biocomputing shown to be capable of performing computations equivalent to those of Turing machines.35,36 Compared to other forms of biological computers, this particular area has seen the most recent research interest.37

14.4.1. Molecular motors

Protein motors are machines capable of directly transferring chemical energy to mechanical energy.5 Certain motor systems, such as those utilizing actin filaments and microtubules, can be used to create biosensing machines, nanomechanical machines, and information storage and processing devices.38 Molecular motors are the means by which recent biological computer designs work at a lower level, since bacteria or amoebae require movement; but the motor mechanisms themselves can also be used to demonstrate computing ability, including for "hard" computational problems.

14.4.2. Use of biological motion in confined structures

In nature, many organisms rely on exploratory behavior for survival. Studies by, for instance, Perumal and colleagues show that Escherichia coli K12-wt displays the ability to deploy complex and efficient algorithms to explore the search-space of a maze.39 The bacteria demonstrated the ability to use a wall-following algorithm (commonly known as the left-hand rule) to efficiently explore the maze — demonstrating the ability of agents such as bacteria to be used for computational tasks usually associated with humans or animals (e.g., mice). Biological agents other than bacteria are also shown to be suitable for spatial problem solving.5,39 Furthermore, problems like mazes can be reformulated as graphs, which, akin to the HPP, are inefficient or impossible to solve scalably on sequential electronic computers, but able to be solved efficiently and swiftly with massively parallel systems.
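A toy grid-maze version of that left-hand rule (the maze, its ASCII encoding and the data structures are invented for illustration; only the wall-following strategy itself comes from the study cited above) shows how purely local decisions suffice to find the exit.

```python
# Left-hand-rule maze exploration: at each cell, prefer turning left,
# then going straight, then turning right, then turning back.
MAZE = ["#########",
        "#S    # #",
        "# ### # #",
        "#   #   #",
        "### ### E",
        "#########"]
grid = [list(row) for row in MAZE]
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]       # N, E, S, W

def solve(r, c, d=1):                            # start at S, facing east
    path = [(r, c)]
    while grid[r][c] != "E":
        for turn in (-1, 0, 1, 2):               # left, straight, right, back
            nd = (d + turn) % 4
            nr, nc = r + DIRS[nd][0], c + DIRS[nd][1]
            if grid[nr][nc] != "#":
                r, c, d = nr, nc, nd
                path.append((r, c))
                break
    return path

print(solve(1, 1))                               # cells visited on the way to E
```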

Nicolau et al.40 demonstrated the use of molecular motors as independent agents to explore and ultimately solve a subset-sum problem (SSP) instance in parallel. In contrast, Hameroff and Rasmussen41 (pre-dating DNA computing) showed how microtubules could be used in neuronal structures for information processing. This work on neuronal-based methods would be extended by Priel42 decades later, showing that the human body utilizes the dendritic cytoskeleton for computation. Both methods rely on microtubules and actin filaments as the crucial components of the biocomputational system.
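The principle of that device can be caricatured in a few lines (toy instance; the physical implementation is a nanofabricated channel network, not a random-number generator): each agent passes one junction per set element and either adds that element's value to its lateral position or not, so the set of exits reached by the agent ensemble is exactly the set of achievable subset sums.

```python
# Agent-based caricature of networks-based subset-sum computation.
import random

values = [2, 5, 9]                       # the SSP instance {2, 5, 9}

def run_agent():
    position = 0
    for v in values:                     # one split junction per element
        if random.random() < 0.5:        # "turn" that includes v in the subset
            position += v
    return position                      # exit index = subset sum

exits = {run_agent() for _ in range(10000)}
print(sorted(exits))                     # [0, 2, 5, 7, 9, 11, 14, 16]
```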

14.4.3. Issues with mobile biologicals

As with DNA computing, the primary issue arising from the use of biological motion in confined structures is that the accuracy of the system is determined by the stochasticity allowed by the structure.40,43 Considering the engineered assays in Nicolau et al.,40 due to construction limitations there is a chance that mobile agents can take "forbidden" moves, and this ultimately leads to extra analysis being required to verify solutions. However, very recent engineering studies44 suggest that these systems may in fact scale successfully by using modern nanofabrication techniques to produce completely or almost completely error-free networks for exploration, reducing or eliminating these issues.

14.5. Computation with Synthetic Biology

Another approach to exploiting the rich detail in biological systems is through the engineering of cells, often referred to as "synthetic biology". Since cells can do things that modern supercomputers simply cannot do, cellular systems have already de facto achieved "biosupremacy". Unlike other methods of biological computing, there is significant discourse around the ethics of synthetic biology for computation.45

14.5.1. Chemotaxis

Chemotaxis is the movement of an organism in response to shifts in chemical gradients in the environment. It is a means by which organisms can react to complex changes in their environment, optimizing sensing and motility capabilities for the given optimization problems they face, such as foraging for nutrients.46 The degree of influence the environment has on the behavior of an organism is linked to the environment it has evolved in and adapted to. Environments with stable conditions support the development of long-term memory47 and chemotactic behaviors, whereas unpredictable environments lead to stochastic and highly adaptive behaviors in the organism.48

14.5.2. Saccharum saccharomyces and genetic toggles

Describing biology with machine analogies is commonplace, and modern biological circuits still draw on the homology seen between the mechanical sciences and cellular mechanisms of computation.49 Akin to the abstraction seen in the work of Adamatzky, synthetic biology has progressed to a stage where abstract computational devices exist, such as the genetic toggle.50 This abstraction is likely to lead to increasingly complex computational systems. There is also the possibility of utilizing abstraction within the biologicals that is absent from traditional computing, such as a cell's ability to operate on different timescales and thus ultimately encode information in different frequencies.47 By extending the basic units of computation to a continuous space, it is believed that biological computers will be able to execute more efficient algorithms in terms of space, energy and components.49
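The canonical example is the two-gene toggle: each gene represses the other, so the pair settles into one of two stable expression states and thereby stores one bit. A back-of-envelope simulation (rate equations in a standard dimensionless form; the parameter values are illustrative, not taken from Ref. [50]) shows the bistability.

```python
# Mutual-repression toggle: du/dt = a/(1 + v^n) - u, dv/dt = a/(1 + u^n) - v.
def toggle(u, v, a=10.0, n=2, dt=0.01, steps=20000):
    for _ in range(steps):               # forward-Euler integration
        du = a / (1 + v ** n) - u        # gene 1, repressed by gene 2
        dv = a / (1 + u ** n) - v        # gene 2, repressed by gene 1
        u, v = u + dt * du, v + dt * dv
    return round(u, 2), round(v, 2)

print(toggle(5.0, 0.1))   # settles high/low: the stored bit is "1"
print(toggle(0.1, 5.0))   # settles low/high: the stored bit is "0"
```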

14.5.3. Cellular computation

Cellular computation allows for fine, continuous computation — unlike the binary realm electronic computers are constrained to.51 Functional components already exist within organisms,52 such as clocks,53 the behavior of which varies with the organism from which the biological information (often genes) comes. Sensing componentry and functional elements, such as a circadian clock, can be transplanted between organisms.54 This transplant ability promises the creation of complex biological circuitry from synthetic biology alone. Complex cellular computation can also arise from abstraction of simpler biological computers, as shown by the work of Head et al.55 on DNA-plasmid computation. In these studies, the line between methods of biological computing is at times blurred.

14.5.4. Multicellular machines

Complex devices, ranging from biological edge-detection56 to advanced Boolean circuits,57 have been created with synthetic biology. The biosupremacy displayed over conventional computing in these systems is often due to the efficiency increase and the ability to compute advanced functions with fewer parts or components.57

14.6. Differences in Biological Computing Paradigms

The four major paradigms in biological computing share many features. All models attempt to solve problems that are difficult or impossible in conventional computing settings. Although all rely on biological processes, the primary computing methods are all distinct, drawing on different properties of biological systems. Similarities can be drawn between DNA computing and motile agent-based computing in that they are both "molecular", though motile agents may be able to adapt to the environment, which encodes the problem of interest.43 Tables 14.1 and 14.2 summarize important differences in terms of computing paradigms.

Table 14.1. Comparison of means of biological computation.

Computation method   Problems solved                  Architecture   Implementation*
DNA computing        HPP,12 TSP, Language Theory13    Discrete       Discrete strings
Slime mould          HPP/TSP3                         Continuous     "Agent-based"
Motile agent         Subset sum/HPP40                 Discrete       Agent-based
Synthetic biology    Biological circuitry50           Both           Varies

Table 14.2. Implementation*-specific information.

Computation method   Abstraction limits       Adaptive reasoning   Scalability
DNA computing        Strings (1D structure)   N/A                  High
Slime mould          Size of mould            Demonstrated29       Low
Motile agent         None described           Depends on agent     High
Synthetic biology    Cell to organism         Described49          Low

Table 14.1 describes the architecture and common implementation choices for the primary biological computing methods. One key observation is that although slime moulds are individual agents, they live and (therefore) compute in continuous time. Table 14.2 describes the implementation specifics of the most common implementation of each computation method. DNA computing is limited by the string structure of DNA (1D chains/strings). DNA also lacks the ability to "adapt" to the problem instance; however, the raw information storage of DNA is vastly superior to all other methods. Slime moulds, many motile agents, and the primary implementations of synthetic biology are all capable of "adaptation" to the problem at hand, but with varied levels of "programmability" currently demonstrated.

Though these tables are a very general attempt at contrasting these techniques, it can be said that the four major paradigms in biological computing share many features. All models can, in principle, solve a problem that is difficult in conventional computing. Although all rely on biological processes, the primary computing methodologies are all distinct. Similarities can be drawn between DNA computing and motile agent-based computing, at least in the latter case in its most recent incarnation, both relying on molecular agents.43 Motile agent-based computing seems to be constrained by the physical space being explored. The problem must be embedded in a spatial representation, which may or may not scale well with the problem instance (this is currently unknown in the general case). All other methods allow for varying degrees of abstraction in a computational sense; for example, algorithms can be intrinsically encoded in biological circuitry, combinatorial problems can be encoded in DNA, and so on.

On the other hand, networks-based biocomputation can encode not only the problem instance, but also an efficient classical algorithm into the structure, as with dynamic programming in the work of Nicolau and colleagues.

As shown in Figure 14.1, solutions to hard problems can be derived using multiple biological facets. The selected problem of the HPP can be abstracted upon for generalization of the principles. In the case of conventional (digital) computing, often one algorithm is enough to solve a problem until a more optimized algorithm is discovered. Biological computers yield the option of having inherently different alphabets for algorithms, allowing for pseudo-programming unlike digital computing and ultimately leading to novel solutions that far exceed the scalability and performance of digital computation.

14.7. Discussion

14.7.1. Critical research gaps in biological computing

The financial cost associated with biological work is much greater than the costs of conventional computer science.59 Therefore, the simulation or modelling of the above areas in biological computing would provide significant insight into how well they can perform. Robust models for BCs, which can be used for comparison with electronic computation, would likely improve the research outcomes of BC experimental work. Models would likely be as diverse as the biocomputers themselves and would obviously try to capture the intrinsic biological means which the biological computer exploits to compute — motile biological agents would likely be modeled with an agent-based scheme, for example. Ideally, the problems which are going to be solved with BCs can be modeled with ECs, but not at a speed or magnitude that can only be accomplished with BCs. Although far from the current research capabilities of BCs, it is possible to model Turing ECs with BCs. There appears to be nothing but research effort halting electronic computers being simulated with DNA (for instance).

[Figure 14.1: four panels (A)–(D); full caption below.]
Figure 14.1. The development of biological computers is driven by the need to solve explicit problems, such as the HPP, that cannot be solved scalably using sequential electronic computers: (A)–(C) all represent unique biologically encoded means to solve the HPP. (A) demonstrates the DNA computing model described by Adleman. (1a) DNA strings that are suitable for the problem must first be acquired. (2a) The DNA is then used to encode the locations and, implicitly, the relationships between them. (3a) The DNA is mixed with enzymes in a homogeneous solution, allowing parallelization of the computation. (4a) The solution must be derived from interpreting and manipulating the results. (B) represents the slime mould approach to computation. (1b) The locations are often represented by food sources.25,26,58 (2b) Slime mould is introduced into the system. (3b) After enough time, a steady-state optimal solution is derived. (4b) Many real-world optimal solutions have been shown for this form of computation.27 (C) is best described as computation with mobile agents. (1c) The problem is encoded into a physical structure; in Nicolau et al.,40 this was the SSP. (2c) Mobile agents then propagate throughout the structure. (3c) After enough time, the results can be directly read out from the exits of the maze encoding the problem instance. (D) As outlined by Tamsir,57 utilizing biological computers creates a great overhead in (1d) implementation and (2d) interpretation.

There appears to be nothing but a lack of research effort preventing electronic computers from being simulated with DNA (for instance); only implementation difficulties stand between us and the superior methods of computation enabled by biocomputing. The question of whether computation by biological means is possible has been answered; the ability to design (cost-effective) computation architectures necessary for better programmability or general-purpose computing is still an open field of research, although some work has been done in this area for DNA computing.60,61 To give an example of what is meant by architecture: there is currently no general-purpose programming language for any of the mentioned biological computers, and the lack of such tools and support mechanisms hinders future research. The scale of problems solved is also incomparable. There exists no biological “super” computer, as there is among electronic computers. Demonstrating that a biological computer can solve problems at the magnitude handled by high-performance computers (e.g., millions if not billions of solutions/inputs) would clearly establish whether BCs have a place in scientific and high-performance computing or will remain a footnote in the history of computing.

It is evident that the form solutions to the mentioned gaps would take will vary greatly, depending entirely on the biological system being utilized or modeled.

14.7.2. The case for biosupremacy

Biological computers are limited by physical laws, as is any computer.49 However, the energy costs associated with BCs are magnitudes lower than those of ECs, and BCs may thus prove an alternative for problems whose magnitude of complexity requires infeasible amounts of power; essentially, some problems may be solved sooner by BCs for this reason. Biological computers are also promising for use in extreme environments. Although this was not touched on in the primary review, Archaea provide options for extreme-temperature (hyperthermophile) environments where ECs cannot operate (≥110°C). Aside from fundamental issues of computational complexity (P vs. NP and others) necessitating the development of massively parallel computing devices, there are more immediate reasons to hope for progress in biocomputation. As the traditional scalability of electronic computers, for example following Moore's Law, comes to an end, it is imperative that the field of biological computing continues to develop novel technologies and novel insights into the nature of computation itself.

References

1. L. N. de Castro, Fundamentals of natural computing: An overview. Phys. Life Rev. 4(1), 1–36 (2007).
2. M. Oltean, Unconventional computing: A short introduction. Studia Universitatis Babes-Bolyai: Series Informatica 54 (2009).
3. A. Adamatzky, Advances in Unconventional Computing: Volume 1: Theory (Springer, 2016).
4. L. Adleman, On Constructing a Molecular Computer (draft).
5. D. V. Nicolau, D. V. Nicolau, G. Solana, K. L. Hanson, L. Filipponi, L. Wang, and A. P. Lee, Molecular motors-based micro- and nano-biocomputation devices. Microelectron. Eng. 83(4), 1582–1588 (2006).
6. L. Adleman, Molecular computation of solutions to combinatorial problems. Science 266(5187), 1021–1024 (1994).
7. G. Paun, G. Rozenberg, and A. Salomaa, DNA Computing: New Computing Paradigms (Springer Science & Business Media, 2005).

8. S. Roweis, E. Winfree, R. Burgoyne, N. V. Chelyapov, M. F. Goodman, P. W. K. Rothemund, and L. M. Adleman, A sticker based model for DNA computation. In Proceedings of the Second Annual Meeting on DNA Based Computers (American Mathematical Society, 1996), pp. 1–29.
9. J. Xu, Y. Dong, and X. Wei, Sticker DNA computer model, Part I: Theory. Chin. Sci. Bull. 49(8), 772 (2004).
10. X. Wang, Z. Bao, J. Hu, S. Wang, and A. Zhan, Solving the SAT problem using a DNA computing algorithm based on ligase chain reaction. Biosystems 91(1), 117–125 (2008).
11. N. C. Stellwagen, DNA gel electrophoresis. In D. Tietz (ed.), Nucleic Acid Electrophoresis, Springer Lab Manual (Springer, Berlin, Heidelberg, 1998), pp. 1–53.
12. L. M. Adleman, The manipulation of DNA to solve mathematical problems is redefining what is meant by “computation”. Sci. Am. 8 (1998).
13. L. Kari, DNA computing: Arrival of biological mathematics. Math. Intell. 19(2), 9–22 (1997).
14. A. Frutos, Demonstration of a word design strategy for DNA computing on surfaces. Nucleic Acids Res. 25(23), 4748–4757 (1997).
15. R. Freund, L. Kari, and G. Paun, DNA computing based on splicing: The existence of universal computers. Theory Comput. Syst. 32(1), 69–112 (1999).
16. A. Currin, K. Korovin, M. Ababi, K. Roper, D. B. Kell, P. J. Day, and R. D. King, Computing exponentially faster: Implementing a nondeterministic universal Turing machine using DNA. J. Roy. Soc. Interf. 14(128), 20160990 (2017).
17. W. Liu, L. Gao, X. Liu, S. Wang, and J. Xu, Solving the 3-SAT problem based on DNA computing. J. Chem. Inform. Comput. Sci. 43(6), 1872–1875 (2003).
18. Z. Ezziane, DNA computing: Applications and challenges. Nanotechnology 17(2), R27–R39 (2005).
19. Q. Liu, L. Wang, A. G. Frutos, A. E. Condon, R. M. Corn, and L. M. Smith, DNA computing on surfaces. Nature 403(6766), 175–179 (2000).
20. S.-Y. Shin, I.-H. Lee, D. Kim, and B.-T. Zhang, Multiobjective evolutionary optimization of DNA sequences for reliable DNA computing. IEEE Trans. Evolution. Comput. 9(2), 143–158 (2005).
21. A. Adamatzky, Towards fungal computer. Interface Focus 8(6), 20180029 (2018).
22. A. Adamatzky, J. Jones, R. Mayne, J. Whiting, V. Erokhin, A. Schumann, and S. Siccardi, PhyChip: Growing computers with slime mould. In S. Stepney, S. Rasmussen, and M. Amos (eds.), Computational Matter, Natural Computing Series (Springer International Publishing, Cham, 2018), pp. 111–128.
23. T. Nakagaki, R. Kobayashi, Y. Nishiura, and T. Ueda, Obtaining multiple separate food sources: Behavioural intelligence in the Physarum plasmodium. Proc. Roy. Soc. Lond. Ser. B: Biol. Sci. 271(1554), 2305–2310 (2004).

24. J. Vallverdu, O. Castro, R. Mayne, M. Talanov, M. Levin, F. Baluska, Y. Gunji, A. Dussutour, H. Zenil, and A. Adamatzky, Slime mould: The fundamental mechanisms of cognition (2017), arXiv:1712.00414 [cs].
25. A. Adamatzky, Slime mould electronic oscillators (2014), arXiv:1403.7350 [physics].
26. A. Adamatzky, Slime mold computing. In R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science (Springer, Berlin, Heidelberg, 2017), pp. 1–16.
27. A. Tero, R. Kobayashi, and T. Nakagaki, Physarum solver: A biologically inspired method of road-network navigation. Physica A: Statist. Mech. Appl. 363, 115 (2006).
28. Y. Sun, Physarum-inspired network optimization: A review (2019), arXiv:1712.02910 [cs].
29. T. Nakagaki and T. Ueda, Phase switching of oscillatory contraction in relation to the regulation of amoeboid behavior by the plasmodium of Physarum polycephalum. J. Theor. Biol. 179(3), 261–267 (1996).
30. A. Adamatzky, Physarum wires, sensors and oscillators. In A. Adamatzky (ed.), Advances in Physarum Machines: Sensing and Computing with Slime Mould, Emergence, Complexity and Computation (Springer International Publishing, Cham, 2016), pp. 231–269.
31. S. Harding, J. Koutnik, K. Greff, J. Schmidhuber, and A. Adamatzky, Discovering Boolean gates in slime mould (2016), arXiv:1607.02168 [cs].
32. A. Adamatzky and J. Jones, Road planning with slime mould: If Physarum built motorways it would route M6/M74 through Newcastle. Int. J. Bifurc. Chaos 20(10), 3065–3084 (2010).
33. A. Tero, K. Yumiki, R. Kobayashi, T. Saigusa, and T. Nakagaki, Flow-network adaptation in Physarum amoebae. Theory Biosci. 127(2), 89–94 (2008).
34. Y. V. Pershin and M. Di Ventra, Memristive and memcapacitive models of Physarum learning. In A. Adamatzky (ed.), Advances in Physarum Machines: Sensing and Computing with Slime Mould, Emergence, Complexity and Computation (Springer International Publishing, Cham, 2016), pp. 413–422.
35. Y. Benenson, T. Paz-Elizur, R. Adar, E. Keinan, Z. Livneh, and E. Shapiro, Programmable and autonomous computing machine made of biomolecules. Nature 414(6862), 430–434 (2001).
36. D. Beaver, Computing with DNA. J. Comput. Biol. 2(1), 1–7 (1995).
37. S. Regot, J. Macia, N. Conde-Pueyo, K. Furukawa, J. Kjellen, T. Peeters, S. Hohmann, E. Nadal, F. Posas, and R. Sole, Distributed biological computation with multicellular engineered networks. Nature 469, 207–211 (2011).
38. D. J. Bakewell and D. V. Nicolau, Protein Linear Molecular Motor-Powered Nanodevices (2007).
39. A. Perumal, Space Partitioning and Maze Solving by Bacteria.

40. D. V. Nicolau, M. Lard, T. Korten, F. C. M. J. M. van Delft, M. Persson, E. Bengtsson, A. Mansson, S. Diez, H. Linke, and D. V. Nicolau, Parallel computation with molecular-motor-propelled agents in nanofabricated networks. Proc. Natl. Acad. Sci. 113(10), 2591–2596 (2016).
41. S. R. Hameroff and S. Rasmussen, Information processing in microtubules: Biomolecular automata and nanocomputers. In F. T. Hong (ed.), Molecular Electronics: Biosensors and Biocomputers (Springer US, Boston, MA, 1989), pp. 243–257.
42. A. Priel, J. A. Tuszynski, and H. F. Cantiello, The dendritic cytoskeleton as a computational device: An hypothesis. In J. A. Tuszynski (ed.), The Emerging Physics of Consciousness, The Frontiers Collection (Springer, Berlin, Heidelberg, 2006), pp. 293–325.
43. D. V. Nicolau Jr., K. Burrage, and D. V. Nicolau, Computing with motile bio-agents. In Biomedical Applications of Micro- and Nanoengineering III, vol. 6416 (International Society for Optics and Photonics, Dec. 2006), p. 64160S.
44. F. C. M. J. M. van Delft, G. Ipolitti, D. V. Nicolau, A. Sudalaiyadum Perumal, O. Kaspar, S. Kheireddine, S. Wachsmann-Hogiu, and D. V. Nicolau, Something has to give: Scaling combinatorial computing by biological agents exploring physical networks encoding NP-complete problems. Interface Focus 8(6), 20180034 (2018).
45. H. Konig, D. Frank, R. Heil, and C. Coenen, Synthetic genomics and synthetic biology applications between hopes and concerns. Curr. Genom. 14(1), 11–24 (2013).
46. A. Celani, Bacterial strategies for chemotaxis response. Proc. Natl. Acad. Sci.
47. N. Vladimirov and V. Sourjik, Chemotaxis: How bacteria use memory. Biol. Chem. 390(11), 1097–1104 (2009).
48. M. P. Neilson, D. M. Veltman, P. J. M. van Haastert, S. D. Webb, J. A. Mackenzie, and R. H. Insall, Chemotaxis: A feedback-based computational model robustly predicts multiple aspects of real cell behaviour. PLOS Biol. 9(5), e1000618 (2011).
49. L. Grozinger, M. Amos, T. E. Gorochowski, P. Carbonell, D. A. Oyarzun, R. Stoof, H. Fellermann, P. Zuliani, H. Tas, and A. Goni-Moreno, Pathways to cellular supremacy in biocomputing. Nature Commun. 10(1), 1–11 (2019).
50. T. S. Gardner, C. R. Cantor, and J. J. Collins, Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767), 339–342 (2000).
51. R. Daniel, J. R. Rubens, R. Sarpeshkar, and T. K. Lu, Synthetic analog computation in living cells. Nature 497(7451), 619–623 (2013).
52. L. Barsanti, V. Evangelista, P. Gualtieri, V. Passarelli, and S. Vestri (eds.), Molecular Electronics: Bio-sensors and Bio-computers. NATO Science Series II (Springer, Netherlands, 2003).
53. M. Ishiura, S. Kutsuna, S. Aoki, H. Iwasaki, C. R. Andersson, A. Tanabe, S. S. Golden, C. H. Johnson, and T. Kondo, Expression of a gene cluster kaiABC as a circadian feedback process in cyanobacteria. Science 281(5382), 1519–1523 (1998).

54. A. H. Chen, D. Lubkowicz, V. Yeong, R. L. Chang, and P. A. Silver, Transplantability of a circadian clock to a noncircadian organism. Sci. Adv. 1(5) (2015).
55. T. Head, G. Rozenberg, R. S. Bladergroen, C. K. Breek, P. H. Lommerse, and H. P. Spaink, Computing with DNA by operating on plasmids. BioSystems 57(2), 87–93 (2000).
56. J. J. Tabor, H. M. Salis, Z. B. Simpson, A. A. Chevalier, A. Levskaya, E. M. Marcotte, C. A. Voigt, and A. D. Ellington, A synthetic genetic edge detection program. Cell 137(7), 1272–1281 (2009).
57. A. Tamsir, J. J. Tabor, and C. A. Voigt, Robust multicellular computing using genetically encoded NOR gates and chemical ‘wires’. Nature 469(7329), 212–215 (2011).
58. A. Adamatzky and T. Schubert, Slime mold microfluidic logical gates. Mater. Today 17(2), 86–91 (2014).
59. National Research Council (US) Committee on Research Opportunities in Biology, Biology Research Infrastructure and Recommendations (National Academies Press, US, 1989).
60. N. Jonoska, S. A. Karl, and M. Saito, Three dimensional DNA structures in computing. Biosystems 52(1), 143–153 (1999).
61. C. Dwyer, J. Poulton, R. Taylor, and L. Vicci, DNA self-assembled parallel computer architectures. Nanotechnology 15(11), 1688–1694 (2004).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0015

Chapter 15

Swarm and Stochastic Computing for Global Optimization

Xin-She Yang
School of Science and Technology, Middlesex University London, The Burroughs, London NW4 4BT, UK

Many problems in data mining and machine learning are related to optimization, and optimization techniques are often used to solve such problems. Traditional techniques such as gradient-based methods can be efficient, but they are local optimizers. For global optimization, alternative approaches tend to be nature-inspired metaheuristic algorithms. We introduce some of the nature-inspired optimization algorithms with emphasis on their main characteristics. We also highlight the role of algorithmic components in such algorithms, and then we conclude with a brief discussion about some open problems.

15.1. Introduction

Optimization is important in many applications, from engineering designs to business planning and from vehicle routing to deep learning. After all, the goal of designs and planning may be to maximize efficiency, accuracy, performance, profit and sustainability, and to minimize costs, wastage, energy consumption, travel time or distance, environmental impact and others. Though such optimization metrics may not always be achievable, we still try to find solutions that are close to optimality, or practically acceptable good solutions, so as to maximize or minimize the objectives as much as possible. However, in many cases, we may not be able to define the objective explicitly or rigorously, but we may still want to optimize it. For example, we want high-quality service during a holiday,
but it would be difficult to define the quality of service exactly, because different people may have different expectations for the same type of service. To be mathematically exact, we will assume that all objectives in this work can be explicitly defined and that all constraints are also mathematically explicit. In addition, almost all optimization problems are subject to constraints, which is especially true for real-world problems. In fact, both the objective and the constraints can be highly nonlinear, leading to multimodal optimization problems, which makes such problems more difficult to solve. Despite huge research efforts in both theory and practice, we can only solve three types of optimization problems efficiently: linear programming, convex optimization, and transformable problems.1 Linear programming concerns optimization problems with both linear objectives and linear constraints, for which efficient methods such as simplex methods exist. Convex optimization is a special class of optimization problems with convex objectives and convex domains; the optimality is thus global and unique, with efficient algorithms.1 The third type consists of problems that we can somehow convert, by transforming and/or reformulating the problem, into the previous two types (either linear or convex). For example, the method of least squares essentially formulates nonlinear curve-fitting problems as a convex optimization problem, so we can use efficient algorithms such as Gauss–Newton methods.1,2 Apart from these three types, it seems that most problems are nonlinear and thus difficult to solve. Obviously, approximation methods such as quadratic programming, sequential quadratic programming, trust-region methods, interior-point methods and others exist and can be effective.3 However, there is no guarantee that the global optimality is achievable.
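To make the "transformable" class concrete, here is a minimal Python sketch (not from the chapter; the quadratic model and the synthetic data are illustrative assumptions) in which a nonlinear-looking curve-fitting task becomes a convex linear least-squares problem, solvable in a single step:

```python
import numpy as np

# Synthetic, illustrative data: y = 2 + 3x - x^2 plus noise (assumed example).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
y = 2.0 + 3.0 * x - x**2 + 0.1 * rng.standard_normal(x.size)

# The fitting problem becomes convex because the model is linear in its
# parameters: build the design matrix for y ~ c0 + c1*x + c2*x^2.
A = np.vander(x, N=3, increasing=True)

# Linear least squares: minimize ||A c - y||^2, a convex problem.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print("fitted coefficients:", coeffs)  # close to [2, 3, -1]
```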
Recent trends tend to use evolutionary algorithms, stochastic heuristic algorithms and nature-inspired algorithms to deal with highly nonlinear problems. Nature-inspired algorithms, such as differential evolution, particle swarm optimization, cuckoo search, and firefly algorithms, are simple, flexible, and yet efficient in solving a wide range of real-world problems.4,5 However, they do have some issues, such as higher computational costs and a lack of mathematical rigour. Therefore, this work summarizes and highlights the latest developments in this area as alternative computing for global optimization.

15.2. Optimization

Almost all optimization problems can be formulated with m objectives

f(x) = [f_1(x), f_2(x), …, f_m(x)],   (15.1)

subject to M equality constraints and N inequality constraints

h_i(x) = 0,  (i = 1, 2, …, M),   (15.2)
g_j(x) ≤ 0,  (j = 1, 2, …, N),   (15.3)

where x is a vector of design variables in the D-dimensional search space. That is,

x = (x_1, x_2, …, x_D)^T ∈ R^D.   (15.4)

In the special case when m = 1, we have standard single-objective optimization, which will be our focus here in this chapter. Thus, we will write the objective as a single objective f(x). In general, all the functions f(x), h_i(x) and g_j(x) are nonlinear, and we have nonlinear optimization problems, which is usually the case in engineering design optimization.6,7 Though there are different algorithms for optimization,1,8 the most widely used class of techniques is gradient-based, and a fundamental example is the Newton method or Newton–Raphson method. Starting with an initial solution x_0, Newton's iterative formula for optimization can be written as

x_{t+1} = x_t − ∇f(x_t)/∇²f(x_t) = x_t − ∇f/H,   (15.5)

where ∇f is the gradient vector

∇f = (∂f/∂x_1, ∂f/∂x_2, …, ∂f/∂x_D)^T,   (15.6)
and H is the Hessian matrix

H = ( ∂²f/∂x_1²      …   ∂²f/∂x_1∂x_D )
    ( ⋮              ⋱   ⋮            )
    ( ∂²f/∂x_D∂x_1   …   ∂²f/∂x_D²    ).   (15.7)

The increment Δx_t, or step size, of the iteration requires both first and second derivatives:

x_{t+1} = x_t + Δx_t,   Δx_t = −∇f/H.   (15.8)

However, the computation of such derivatives can be very expensive when the dimensionality D is high. In addition, the inverse of the Hessian (i.e., H⁻¹) can be computationally expensive as well.3 Thus, the use of the exact Hessian is not desirable in practice. Approximations of the Hessian are usually used instead, which leads to a class of quasi-Newton methods with various ways of approximating H. Obviously, the simplest way is to use H = λI, where I is the identity matrix of the same size as H and λ should be the largest eigenvalue of H. However, the largest eigenvalue is not known in advance and can also be computationally expensive to calculate. Therefore, a further approximation is to use a fixed parameter α = O(1/λ), which means H⁻¹ = αI. The iteration formula then becomes

x_{t+1} = x_t + Δx_t,   Δx_t = −α∇f,   (15.9)

where α is often referred to as the learning rate. The move or step is along the direction of the gradient. For minimization problems, the move to better solutions is down the gradient, leading to the steepest gradient descent method. For maximization problems, it climbs up the gradient, leading to so-called hill-climbing. Though this formula looks simple, it forms a fundamental building block of many algorithms. Newton's method and its variants can be very efficient and are thus widely used in many applications. However, they do have a serious disadvantage: they are local optimizers, and the final solutions obtained can largely depend on their initial starting points.
Therefore, they are not suitable for global optimization, because most nonlinear global optimization problems are multimodal with many local optima. Gradient-based methods tend to get stuck at some local optimum, thus limiting the possibility of finding the true global optimality.
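To illustrate the point, the following minimal Python sketch (an illustration, not part of the chapter; the quartic test function, the learning rate and the iteration count are assumptions) implements the fixed-learning-rate update of Equation (15.9) and shows that different starting points land in different local minima:

```python
def grad_descent(grad, x0, alpha=0.01, n_iters=1000):
    """Iterate x_{t+1} = x_t - alpha * grad_f(x_t), Equation (15.9)."""
    x = float(x0)
    for _ in range(n_iters):
        x = x - alpha * grad(x)
    return x

# Multimodal test function f(x) = x^4 - 4x^2 + x (assumed example);
# its derivative is f'(x) = 4x^3 - 8x + 1.
grad_f = lambda x: 4 * x**3 - 8 * x + 1

# Different starting points converge to different local minima.
print(grad_descent(grad_f, x0=-2.0))  # ends near the left (global) minimum
print(grad_descent(grad_f, x0=+2.0))  # ends near the right (local) minimum
```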

15.3. Stochastic Enhancements

As we pointed out earlier, gradient-based methods such as Equation (15.8) can depend on the initial guess x_0, so modifications are needed to increase the probability of finding the global optimality. One simple and yet effective remedy is the so-called random restart. The main idea is to run a local optimizer such as hill-climbing multiple times, each time starting from a different initial solution x_0. If the randomized restarting points can somehow cover a very large region of the search space, it can be expected that the global optimality will be reached by at least some of the runs. In essence, such random restart is a Monte Carlo method in the sense that x_0 is sampled randomly in the search space, which indeed belongs to the class of Monte Carlo methods.9 However, due to the dependence of the final solution on the initial point, this method does not belong to the class of Markov chain Monte Carlo (MCMC) methods.10,11 MCMC methods are very powerful methods for sampling and optimization, and many nature-inspired algorithms can loosely be considered as different ways of realizing MCMC. Looking at random restart hill-climbing from a different perspective, it is a heuristic approach because it essentially proceeds by trial and error. Heuristic approaches are powerful methods for many applications.12–14
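A minimal Python sketch of the random-restart idea just described, under illustrative assumptions (the multimodal objective, the search interval, and a simple stochastic hill-climber standing in for the local optimizer):

```python
import math
import random

def hill_climb(f, x0, step=0.1, n_iters=500):
    """Simple stochastic hill-climbing (here: descent) from x0."""
    x, fx = x0, f(x0)
    for _ in range(n_iters):
        cand = x + random.uniform(-step, step)  # local trial move
        if f(cand) < fx:                        # accept only improvements
            x, fx = cand, f(cand)
    return x, fx

def random_restart(f, lo, hi, n_restarts=20):
    """Restart the local optimizer from random points covering [lo, hi]."""
    best = None
    for _ in range(n_restarts):
        x0 = random.uniform(lo, hi)             # Monte Carlo sampling of x0
        x, fx = hill_climb(f, x0)
        if best is None or fx < best[1]:
            best = (x, fx)
    return best

# Multimodal example (assumed): many local minima, global minimum near x = 0.
f = lambda x: x**2 + 10 * (1 - math.cos(3 * x))
print(random_restart(f, -10.0, 10.0))
```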
Loosely speaking, an effective global optimizer should have two key components: exploration and exploitation,15 or diversification and intensification. Exploration or diversification enables the algorithm to search a large region of the search space more efficiently, increasing the probability of finding the global optimality. This can be achieved by simple uniform randomization or by more sophisticated probability distributions. On the other hand, exploitation or intensification uses the information found during the iterations, such as the gradients, so as to reduce the number of iterations and speed up the convergence. Too much exploration and too little exploitation will usually generate solutions that may be far away from the existing solutions, so these solutions may not be feasible. As a result, the convergence is slow, though the probability of finding the global optima increases. On the other hand, too much exploitation and too little exploration make the solutions focus on a local region, and the modifications can be small. Consequently, the convergence rate usually increases, but the probability of finding the global optima may be reduced. Therefore, a fine balance is needed. However, how to achieve such a balance is still an open problem.

Looking from a different perspective, we can consider an algorithm as a self-organized system. In physical systems such as crystallization and pattern formation, the state of a system can be transformed into a more organized state under the right conditions. In many cases, this transformation is from a far-from-equilibrium state to a self-organized state, leading to so-called self-organization.16 For algorithms to iterate, the initial solution (or state) is usually very different from the final optimal solution (converged state). The iterations and selection of solutions may thus mimic certain characteristics of self-organization. In this sense, an algorithm can act as a mechanism to drive the solutions/states from far from optimality into a region or state near or at optimality.2

So far we have essentially divided the algorithms into two main classes: deterministic and stochastic. Deterministic algorithms such as the Newton–Raphson method have a fixed path once we have chosen a starting point. There is no randomness, and we should get the same set of solutions every time we run the algorithm. Stochastic algorithms such as hill-climbing with random restart do not have a fixed path. Due to their intrinsic randomness, the path is not repeatable. However, if multiple runs are allowed, the final solutions may be approximately the same within a given tolerance in the statistical sense.

In contrast with deterministic algorithms, stochastic algorithms do have certain advantages, such as a higher probability of finding the global optimality, as we have briefly discussed. In fact, in the current trends in optimization and machine learning, stochastic algorithms form a significant part of the latest algorithms. Therefore, we now focus mainly on stochastic algorithms. If we analyze stochastic algorithms further, we can divide them into evolutionary algorithms and nature-inspired algorithms. Among evolutionary algorithms, we have genetic algorithms, evolution strategies and others. Nature-inspired algorithms can be further subdivided into swarm intelligence (SI)-based algorithms, such as particle swarm optimization, and non-SI-based algorithms, such as simulated annealing.

15.4. Evolutionary Computation

There are quite a few algorithms for optimization that use Darwin's theory of evolution as a model, and the genetic algorithm is a representative example.17 In fact, there are many variants of genetic algorithms (GA), which, together with evolution strategy and genetic programming, form so-called evolutionary computation.18,19 GA was developed by John Holland in the 1960s; it uses three genetic operators: crossover, mutation, and selection.17,20 Once a real-vector solution x is converted into a binary string, called a chromosome,

x → [1, 0, 1, 1, 0, 0, 0, …, 0, 1, 0],   (15.10)

crossover is the main operation for generating two offspring solutions from two parent solutions, by swapping one or often multiple parts of one solution string x_i with the corresponding parts of another solution string x_j. Mutation modifies a chromosome by flipping one bit or often multiple bits from 1 to 0 or from 0 to 1, which generates a new solution. At each generation, the fittest solutions will be selected to pass on to the next generation, and, in some GA variants, the reproduction probability of an individual solution may be directly related to its fitness, leading to a certain form of elitism. In GA,
crossover is usually carried out with a higher probability, typically in the range of 0.6 to 0.95, whereas mutation is carried out with a relatively lower probability, often in the range of 0.001 to 0.05. Such typical parameter settings can lead to a higher degree of mixing and exploitation ability, with a moderate or low degree of exploration at late iterations. In practice, GA can converge well, and the global optimality can be achieved in many applications. Though not directly related to evolution, Tabu search is an interesting approach that uses tabu lists to let the algorithm avoid certain types of past solutions, and thus potentially guide the search towards new regions.21 On the other hand, genetic programming is a technique to evolve computer codes using evolutionary/genetic operators.22 GA was initially designed for single-objective optimization, and it has been extended to solve multi-objective optimization problems. For optimization problems with multiple objectives, Pareto optimality is used and Pareto fronts are sought. For example, in the non-dominated sorting genetic algorithm (NSGA-II),23 GA is first used to find a set of non-dominated solutions, and these non-dominated solutions are sorted in terms of their ranks and crowding distances. The main aim is to sort and select such solutions so that they distribute on the Pareto front relatively uniformly. There are many different approaches for multi-objective optimization in the literature, forming a set of multi-objective evolutionary algorithms.24
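The GA loop described above can be sketched in a few lines of Python; the OneMax-style fitness, the population size, and the crossover and mutation probabilities (chosen from the typical ranges just quoted) are illustrative assumptions, not values from the chapter:

```python
import random

POP, BITS, P_CROSS, P_MUT = 30, 20, 0.9, 0.02  # assumed illustrative values

def fitness(chrom):
    # OneMax (assumed toy problem): maximize the number of 1-bits.
    return sum(chrom)

def crossover(a, b):
    # One-point crossover, applied with the (high) probability P_CROSS.
    if random.random() < P_CROSS:
        k = random.randrange(1, BITS)
        return a[:k] + b[k:], b[:k] + a[k:]
    return a[:], b[:]

def mutate(chrom):
    # Flip each bit with the (low) mutation probability P_MUT.
    return [1 - g if random.random() < P_MUT else g for g in chrom]

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for gen in range(100):
    # Fitness-proportionate selection, a simple form of elitism.
    parents = random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=POP)
    nxt = []
    for i in range(0, POP, 2):
        c1, c2 = crossover(parents[i], parents[i + 1])
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt
print(max(fitness(c) for c in pop))  # should approach BITS
```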

15.5. Nature-Inspired Computing

Nature-inspired algorithms for optimization form a class of nature-inspired metaheuristics.2,4,25 Nature-inspired algorithms usually draw inspiration from nature and try to mimic certain successful characteristics of biological systems and physical laws in nature. These algorithms are simple, flexible and yet sufficiently effective for solving problems in optimization and computational intelligence, with a wide range of applications. There are two main categories of nature-inspired algorithms: trajectory-based and population-based. Trajectory-based algorithms
such as simulated annealing use a zigzag path to represent the search in the design space, whereas population-based algorithms such as ant colony optimization use a set of multiple agents to carry out the search simultaneously in parallel. Most algorithms use population-based approaches because they tend to be more efficient compared with trajectory-based approaches. In fact, most population-based algorithms nowadays belong to the swarm intelligence (SI)-based algorithms, because these algorithms try to mimic certain swarming behavior and attempt to achieve a certain level of collective intelligence. Not all algorithms are SI-based, even though they may use a population of different solutions. Therefore, we can also divide nature-inspired algorithms into two categories: SI-based and non-SI-based algorithms.

15.5.1. Non-SI-based approaches

Quite a few algorithms belong to this class. We will briefly outline the key ideas of each algorithm (a minimal code sketch of the SA and DE updates follows this list).

• Simulated annealing (SA) is a trajectory-based and non-SI-based algorithm, which simulates the behavior of metal annealing.26 For a minimization problem with an objective function f(x), the probability of a search move being accepted or not is governed by a Boltzmann-style distribution

p(x_i → x_j) ∼ exp[−max{0, f(x_j) − f(x_i)}/T],   (15.11)

where T is the temperature. In the case T → 0, only moves that reduce the objective value (for minimization) will be accepted, and the method becomes a gradient-based descent, similar in spirit to Newton's method. Mathematically, SA has been shown to have good convergence if the temperature is lowered gradually and the algorithm makes enough moves at each temperature level.27
• Differential evolution (DE) is a population-based algorithm,28 but it is not SI-based. There are two main mechanisms in DE for generating new solution vectors. One main mechanism is the mutation operator, which modifies a solution x_r^t at iteration t to form a
so-called donor solution v_i:

v_i = x_r^t + F (x_p^t − x_q^t),   F ∈ (0, 2),   (15.12)

where F is the differential weight controlling the strength of the mutation. The three solutions with indices r, p and q must be distinct; they can be selected by random permutation from the population of n solutions. The other main step is crossover, with a crossover probability C_r, carried out for each component j, where j = 1, 2, …, D in the D-dimensional space. This is achieved by generating a uniformly distributed random number r_j ∈ [0, 1] for each component j of x_i^t (denoted by x_{i,j}^t). The component is updated with the corresponding component of the donor vector, v_{i,j}, if r_j ≤ C_r; otherwise, the component remains unchanged. That is,

u_{i,j} = { v_{i,j},    if r_j ≤ C_r,
          { x_{i,j}^t,  otherwise.        (15.13)

The selection of new solutions is carried out as x_i^{t+1} ← u_i if f(u_i) ≤ f(x_i^t); otherwise, x_i^{t+1} ← x_i^t. Depending on how the solutions are used and how new solutions are generated, there are many different variants of DE.28,29 For example, the self-adaptive DE attempts to vary C_r randomly in the range of [0.1, 0.9].30 DE has also been extended to solve multi-objective optimization problems. For a detailed review, please refer to Das et al.31
• Eagle strategy (ES) is a hybrid algorithm that combines two different algorithms at two different stages. The first, crude stage uses a randomization technique to mimic eagle flights and search characteristics, and then a local optimizer is used to speed up the convergence.32
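As promised above, a minimal Python sketch of the SA acceptance rule, Equation (15.11), and one DE generation, Equations (15.12)–(15.13); the sphere objective and the values F = 0.8 and C_r = 0.9 are illustrative assumptions:

```python
import math
import random

def f(x):
    # Assumed toy objective (sphere function); minimum at the origin.
    return sum(v * v for v in x)

def sa_accept(x_i, x_j, T=1.0):
    """SA move acceptance, Equation (15.11): always accept improvements,
    accept worse moves with Boltzmann probability exp(-Δf/T)."""
    p = math.exp(-max(0.0, f(x_j) - f(x_i)) / T)
    return random.random() < p

def de_step(pop, F=0.8, Cr=0.9):
    """One DE generation: mutation (15.12), crossover (15.13), selection."""
    n, D = len(pop), len(pop[0])
    new_pop = []
    for i in range(n):
        r, p, q = random.sample([k for k in range(n) if k != i], 3)
        donor = [pop[r][j] + F * (pop[p][j] - pop[q][j]) for j in range(D)]
        # Binomial crossover: take the donor component when r_j <= Cr.
        trial = [donor[j] if random.random() <= Cr else pop[i][j]
                 for j in range(D)]
        # Greedy selection: keep the trial vector only if it is no worse.
        new_pop.append(trial if f(trial) <= f(pop[i]) else pop[i])
    return new_pop

pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for _ in range(200):
    pop = de_step(pop)
print(min(f(x) for x in pop))  # approaches 0
```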

15.5.2. SI-based approaches

There are many algorithms that are based on the swarming behavior of birds, fish, insects and other swarms.33 These algorithms form the vast majority of swarm intelligence-based algorithms. We now briefly highlight the key ideas of a few of these nature-inspired algorithms for optimization (a compact sketch of the PSO and FA updates, together with a Lévy-flight step, appears after this list).

• Ant colony optimization (ACO) mimics the collective behavior of social ants to carry out optimization tasks.34,35 ACO uses the pheromone concentration as a measure of the fitness of a route or path. Each ant deposits a fixed amount or rate of pheromone as it moves, and the pheromone evaporates at a constant rate. The selection of a route at a junction of a network depends on the pheromone concentration of a path, in addition to some probabilistic components. As the system evolves, certain routes become more preferable, leading to a converged path or solution for an optimization problem.
• Bees-based algorithms typically do not use pheromone, though they try to mimic certain foraging behavior of honeybees. For example, a honeybees-based approach was used for dynamic server allocation,36 while the artificial bee colony uses a combination of scout bees and foraging bees.37,38 Under appropriate conditions, such bees-based algorithms can be useful for solving certain classes of optimization problems.
• Particle swarm optimization (PSO) is a swarm intelligence-based optimizer,39,40 which uses a swarm of particles (see the sketch after this list). Each particle i has a velocity v_i and a position vector x_i at iteration t. They are updated iteratively by

v_i^{t+1} = v_i^t + α ε_1 (g* − x_i^t) + β ε_2 (x_i* − x_i^t),   (15.14)
x_i^{t+1} = x_i^t + v_i^t Δt,   (15.15)

where α, β > 0 are learning parameters, and ε_1, ε_2 are random numbers drawn from the uniform distribution on [0, 1]. Among the population of n solutions, there is a current best solution g*, and each particle has a historical best solution x_i*. The step increment Δt = 1 can be used because the iteration is discrete. Since we are more concerned with the mathematical formulas of an algorithm, we treat all quantities as dimensionless variables, so Δt = 1 can be used and is thus omitted in all relevant formulas. It has been shown, using dynamical system theory, that PSO can have good convergence behavior under certain conditions.41 PSO has also been extended to solve multi-objective optimization.42

• Firefly algorithm (FA) is also a swarm intelligence-based algorithm, which mimics the flashing behavior of tropical fireflies.43 The objective of an optimization problem is encoded as light and converted into a light-based landscape, and fireflies swarm and attract each other according to their attractiveness:

x_i^{t+1} = x_i^t + β_0 e^{−γ r_{ij}²} (x_j^t − x_i^t) + α ε_i^t,   (15.16)

where β_0 is the attractiveness at zero distance and α is a parameter controlling the strength of the randomization term, with ε_i^t being drawn from a normal distribution. r_{ij} is the Cartesian distance between firefly i at x_i and firefly j at x_j. In addition, γ > 0 is a scaling parameter that controls the visibility of a firefly; a lower value of γ means higher visibility. FA has been shown to be effective for solving multimodal optimization problems (see the sketch after this list). FA has also been extended to many variants, including the chaotic FA,44,45 the discrete FA,46 the multiobjective FA47 and others.48 FA and its variants have been applied to many applications, such as clustering and image processing49 and swarm robots.50,51 For more details, please refer to some recent comprehensive reviews.48,52,53
• The bat algorithm (BA) is a population-based algorithm,54 inspired by the echolocation and frequency-tuning characteristics of microbats. A bat is associated with a position x_i^t (or a solution vector) and a flying velocity v_i^t, as well as a frequency range for f_i. We have

f_i = f_min + (f_max − f_min) β,   (15.17)
v_i^t = v_i^{t−1} + (x_i^{t−1} − x*) f_i,   (15.18)
x_i^t = x_i^{t−1} + v_i^t,   (15.19)

where the uniformly distributed random number β ∈ [0, 1] simulates the frequency tuning. This is further enhanced by switching between exploration and exploitation using variations of the loudness and the pulse emission rate. BA has also been extended to different variants such as the chaotic BA,55 the discrete BA,56,57 the directional bat algorithm58 and others. In addition, rigorous mathematical
analysis using dynamical system theory has shown the global convergence of BA.59
• Cuckoo search (CS) is a nature-inspired algorithm,60 inspired by the cuckoo-host co-evolution and the brood parasitism of some cuckoo species. Cuckoos lay eggs in the nests of host birds and let the host birds hatch the eggs. Some eggs laid by cuckoos will be discovered and abandoned, typically with a probability p_a of about 1/4 to 1/3.61,62 There are two search mechanisms, enhanced by Lévy flights. The main local search mechanism is

x_i^{t+1} = x_i^t + α s ⊗ H(p_a − ε) ⊗ (x_j^t − x_k^t),   (15.20)

where α is a parameter and s is the step size. x_j^t and x_k^t are two different solutions, whereas H(p_a − ε) is the Heaviside function with a switch probability p_a and a uniformly distributed random number ε ∈ [0, 1]. It is worth pointing out that the multiplication here is an entry-wise product, denoted by ⊗. The global search step is carried out by

x_i^{t+1} = x_i^t + α L(s, λ),   (15.21)

where L(s, λ) is the Lévy step size to be drawn from a Lévy distribution approximated by

L(s, λ) ∼ A/s^{λ+1},   (s ≫ 0).   (15.22)

Here, A > 0 is a constant and 0 < λ ≤ 2 is the exponent of the Lévy distribution.63 A proper realization of Lévy flights can be implemented using Mantegna's algorithm.64 CS has been extended to multi-objective optimization.65 CS and its variants have many applications.66
• The flower pollination algorithm (FPA) is a population-based algorithm,67 inspired by the pollination characteristics of flowering plants.68 FPA mimics both biotic and abiotic pollination to carry out search moves in the parameter space. FPA has been extended to different variants with a wide range of applications,67,69,70 such as aircraft landing scheduling71 and engineering design optimization.72
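As noted in the list above, here is a compact Python sketch of the PSO update, Equations (15.14)–(15.15), the FA attraction move, Equation (15.16), and a Lévy step via Mantegna's algorithm64 as used by CS and FPA; all parameter values are illustrative assumptions, and solutions are treated as NumPy arrays:

```python
import numpy as np

rng = np.random.default_rng(42)

def pso_update(x, v, x_best, g_best, alpha=2.0, beta=2.0):
    """One PSO step, Equations (15.14)-(15.15), with Δt = 1.
    The position update uses v_i^t, as written in Equation (15.15)."""
    e1 = rng.random(x.shape)  # ε_1 ~ U[0, 1]
    e2 = rng.random(x.shape)  # ε_2 ~ U[0, 1]
    x_new = x + v                                                      # (15.15)
    v_new = v + alpha * e1 * (g_best - x) + beta * e2 * (x_best - x)   # (15.14)
    return x_new, v_new

def firefly_move(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.1):
    """Move firefly i towards brighter firefly j, Equation (15.16)."""
    r2 = np.sum((x_i - x_j) ** 2)          # squared distance r_ij^2
    eps = rng.standard_normal(x_i.shape)   # ε_i^t ~ N(0, 1)
    return x_i + beta0 * np.exp(-gamma * r2) * (x_j - x_i) + alpha * eps

def levy_step(shape, lam=1.5):
    """Lévy-distributed step sizes via Mantegna's algorithm."""
    from math import gamma as G, pi, sin
    sigma = (G(1 + lam) * sin(pi * lam / 2)
             / (G((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma, shape)
    v = rng.normal(0.0, 1.0, shape)
    return u / np.abs(v) ** (1 / lam)

# Example: one PSO step for a single 2-D particle (illustrative values).
x, v = np.array([1.0, -2.0]), np.zeros(2)
x, v = pso_update(x, v, x_best=np.array([0.5, 0.0]), g_best=np.zeros(2))
```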

Obviously, there are many other nature-inspired algorithms for optimization in the literature; we do not have space to review them all. For more comprehensive reviews, please refer to recent articles and discussions about bio-inspired computation.4,73

15.6. Discussions

For the purposes of optimization, data mining, and computational intelligence, many alternative techniques have been developed. Most of these alternative techniques are nature-inspired optimization algorithms. Despite the extensive studies and their effectiveness, there is no single algorithm that can solve all problems, or even a vast majority of problems, efficiently. This is consistent with the so-called no-free-lunch (NFL) theorems,74 which were proved in 1997. The main NFL theorem states that if an algorithm A outperforms another algorithm B on some problems, then algorithm B will outperform A on other problems; that is, the performance of A is equivalent to that of B if performance is averaged over all possible problems. However, there is no need for performance to be averaged over all possible problems. In practice, the performance of an algorithm for solving a class of problems is usually measured over an individual problem or a small set of problems, which means that some algorithms are better than others. This is consistent with empirical observations. In fact, free lunches may exist for co-evolutionary algorithms,75 continuous problems,76 and multiobjective optimization.77

One of the most active research areas related to optimization and nature-inspired computation is machine learning. Traditional techniques such as neural networks have been further improved in various ways and enhanced for deep learning.78,79 Though nature-inspired algorithms can be surprisingly efficient for different applications, their theoretical understanding still lags behind. Researchers are starting to focus on the following challenging issues.80

• Mathematical Analysis: We know these algorithms can work well in practice, but we rarely know why they work and under exactly
what conditions. Thus, rigorous mathematical analysis of each algorithm, and of sets of algorithms, should be carried out. There are some studies concerning the convergence of genetic algorithms,20 simulated annealing,27 particle swarm optimization,41,42 the bat algorithm59 and others. However, a unified mathematical framework is still lacking.
• Parameter Tuning: Parameter settings can largely affect the performance of an algorithm, but it is not straightforward to tune the parameters for a given algorithm.81 In the present literature, tuning is largely empirical or done by parametric studies. A systematic tuning tool should be developed.
• Scalability: Though nature-inspired algorithms are efficient, the problems that have been solved are mainly of at most moderate scale, with a few dozen or at most a few hundred parameters. It is not clear whether the algorithms that work well for small-scale or moderate-scale optimization problems will work for large-scale problems by scaling up or by using parallelization.

These challenging issues present good research opportunities, which can form active topics for further research.

References

1. S. P. Boyd and L. Vandenberghe, Convex Optimization (Cambridge University Press, Cambridge, UK, 2004).
2. X.-S. Yang, Nature-Inspired Optimization Algorithms (Elsevier Insight, London, 2014).
3. X.-S. Yang, Optimization Techniques and Applications with Examples (John Wiley & Sons, Hoboken, NJ, USA, 2018).
4. X.-S. Yang, Nature-Inspired Computation and Swarm Intelligence: Algorithms, Theory and Applications (Academic Press, Elsevier, London, 2020).
5. X.-S. Yang, Cuckoo Search and Firefly Algorithm: Theory and Applications, vol. 516, Studies in Computational Intelligence (Springer, Heidelberg, Germany, 2013).
6. K. Deb, Optimization for Engineering Design: Algorithms and Examples (Prentice-Hall, New Delhi, 1995).
7. L. C. Cagnina, S. C. Esquivel, and C. A. Coello Coello, Solving engineering optimization problems with the simple constrained particle swarm optimizer. Informatica 32(2), 319–326 (2008).

8. J. L. Chabert, A History of Algorithms: From the Pebble to the Microchip (Springer-Verlag, Heidelberg, 1999).
9. I. M. Sobol, A Primer for the Monte Carlo Method (CRC Press, Boca Raton, FL, 1994).
10. C. J. Geyer, Practical Markov chain Monte Carlo. Statist. Sci. 7(6), 473–511 (1992).
11. A. Ghate and R. Smith, Adaptive search with stochastic acceptance probability for global optimization. Oper. Res. Lett. 36(3), 285–290 (2008).
12. A. M. Turing, Intelligent Machinery. Technical report, National Physical Laboratory, London, UK (1948).
13. B. J. Copeland, The Essential Turing (Oxford University Press, Oxford, UK, 2004).
14. J. Pearl, Heuristics (Addison-Wesley, New York, USA, 1984).
15. C. Blum and A. Roli, Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Comput. Surveys 25(2), 268–308 (2003).
16. E. F. Keller, Organisms, machines, and thunderstorms: A history of self-organization, Part II: Complexity, emergence, and stable attractors. Hist. Stud. Nat. Sci. 39(1), 1–31 (2009).
17. J. Holland, Adaptation in Natural and Artificial Systems (University of Michigan Press, Ann Arbor, MI, USA, 1975).
18. L. J. Fogel, A. J. Owens, and M. J. Walsh, Artificial Intelligence Through Simulated Evolution (Wiley, New York, USA, 1966).
19. D. B. Fogel, Evolutionary Computation: Toward a New Philosophy of Machine Intelligence (IEEE Press, Piscataway, NJ, 2006).
20. D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning (Addison-Wesley, Reading, MA, USA, 1989).
21. F. Glover and M. Laguna, Tabu Search (Kluwer Academic Publishers, Boston, MA, USA, 1997).
22. J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, Cambridge, MA, USA, 1992).
23. K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, A fast and elitist multiobjective algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002).
24. E. Zitzler, K. Deb, and L. Thiele, Comparison of multiobjective evolutionary algorithms: Empirical results. Evol. Comput. 8(2), 173–195 (2000).
25. E.-G. Talbi, Metaheuristics: From Design to Implementation (John Wiley and Sons, New York, 2009).
26. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Optimization by simulated annealing. Science 220(4598), 671–680 (1983).
27. D. Bertsimas and J. Tsitsiklis, Simulated annealing. Statist. Sci. 8(1), 10–15 (1993).
28. R. Storn and K. Price, Differential evolution: A simple and efficient heuristic for global optimization. J. Global Optim. 11(4), 341–359 (1997).

29. K. Price, R. Storn, and J. Lampinen, Differential Evolution: A Practical Approach to Global Optimization (Springer, Berlin, Germany, 2005).
30. J. Brest, S. Greiner, B. Boskovic, M. Mernik, and V. Zumer, Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark functions. IEEE Trans. Evol. Comput. 10(6), 646–657 (2006).
31. S. Das and P. Suganthan, Differential evolution: A survey of the state-of-the-art. IEEE Trans. Evol. Comput. 15(1), 4–31 (2011).
32. X.-S. Yang and S. Deb, Two-stage eagle strategy with differential evolution. Int. J. Bio-Inspired Comput. 4(1), 1–5 (2012).
33. L. Fisher, The Perfect Swarm: The Science of Complexity in Everyday Life (Basic Books, New York, 2009).
34. M. Dorigo, Optimization, Learning, and Natural Algorithms. Ph.D. Thesis, Politecnico di Milano, Milan, Italy (1992).
35. M. Dorigo, G. Di Caro, and L. Gambardella, Ant algorithms for discrete optimization. Artif. Life 5(2), 137–172 (1999).
36. S. Nakrani and C. Tovey, On honeybees and dynamic server allocation in internet hosting centers. Adapt. Behav. 12(3), 223–240 (2004).
37. D. Karaboga, An Idea Based on Honeybee Swarm for Numerical Optimization. Technical report, Erciyes University, Turkey (2005).
38. D. Karaboga and B. Basturk, On the performance of artificial bee colony (ABC) algorithm. Appl. Soft Comput. 8(1), 687–697 (2008).
39. J. Kennedy and R. Eberhart, Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks (IEEE, Piscataway, NJ, USA, 1995), pp. 1942–1948.
40. J. Kennedy, R. C. Eberhart, and Y. Shi, Swarm Intelligence (Academic Press, London, UK, 2001).
41. M. Clerc and J. Kennedy, The particle swarm: Explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002).
42. M. Reyes-Sierra and C. A. Coello Coello, Multi-objective particle swarm optimizers: A survey of the state-of-the-art. Int. J. Comput. Intell. Res. 2(3), 287–308 (2006).
43. X.-S. Yang, Firefly algorithms for multimodal optimization. In O. Watanabe and T. Zeugmann (eds.), Proceedings of the Fifth Symposium on Stochastic Algorithms, Foundations and Applications, vol. 5792, Lecture Notes in Computer Science (Springer, 2009), pp. 169–178.
44. A. H. Gandomi, X.-S. Yang, S. Talatahari, and A. H. Alavi, Firefly algorithm with chaos. Commun. Nonlin. Sci. Numer. Simul. 18(1), 89–98 (2013).
45. A. Kaveh and S. M. Javadi, Chaos-based firefly algorithms for optimization of cyclically large-size braced steel domes with multiple frequency constraints. Comput. Struct. 214(1), 28–39 (2019).
46. E. Osaba, X.-S. Yang, F. Diaz, E. Onieva, A. Masegosa, and A. Perallos, A discrete firefly algorithm to solve a rich vehicle routing problem modelling a newspaper distribution system with recycling policy. Soft Comput. 21(18), 5295–5308 (2017).

47. X.-S. Yang, Multiobjective firefly algorithm for continuous optimization. Eng. Comput. 29(2), 175–184 (2013).
48. S. L. Tilahun, J. M. T. Ngnotchouye, and N. N. Hamadneh, Continuous versions of firefly algorithm: A review. Artif. Intell. Rev. 51(3), 445–492 (2019).
49. J. Senthilnath, S. N. Omkar, and V. Mani, Clustering using firefly algorithm: Performance study. Swarm Evol. Comput. 1(3), 164–171 (2011).
50. F. D. Rango, N. Palmieri, X.-S. Yang, and S. Marano, Swarm robotics in wireless distributed protocol design for coordinating robots involved in cooperative tasks. Soft Comput. 22(13), 4251–4266 (2018).
51. N. Palmieri, X.-S. Yang, F. D. Rango, and A. F. Santamaria, Self-adaptive decision-making mechanisms to balance the execution of multiple tasks for a multi-robots team. Neurocomputing 306(1), 17–36 (2018).
52. I. Fister, M. Perc, S. M. Kamal, and I. Fister, A review of chaos-based firefly algorithms: Perspectives and research challenges. Appl. Math. Comput. 252(1), 155–165 (2015).
53. S. L. Tilahun and J. M. T. Ngnotchouye, Firefly algorithm for discrete optimization problems: A survey. KSCE J. Civil Eng. 21(2), 535–545 (2017).
54. X.-S. Yang, A new metaheuristic bat-inspired algorithm. In C. Cruz, J. R. González, D. A. Pelta, and G. Terrazas (eds.), Nature Inspired Cooperative Strategies for Optimization (NISCO 2010), vol. 284, Studies in Computational Intelligence (Springer, Berlin, Germany, 2010), pp. 65–74.
55. A. H. Gandomi and X.-S. Yang, Chaotic bat algorithm. J. Comput. Sci. 5(2), 224–232 (2014).
56. E. Osaba, X.-S. Yang, F. Diaz, P. Lopez-Garcia, and R. Carballedo, An improved discrete bat algorithm for symmetric and asymmetric travelling salesman problems. Eng. Appl. Artif. Intell. 48(1), 59–71 (2016).
57. E. Osaba, X.-S. Yang, I. Fister Jr., P. Lopez-Garcia, and A. Vazquez-Paravila, A discrete and improved bat algorithm for solving a medical goods distribution problem with pharmacological waste collection. Swarm Evol. Comput. 44(1), 273–286 (2019).
58. A. Chakri, R. Khelif, M. Benouaret, and X.-S. Yang, New directional bat algorithm for continuous optimization problems. Exp. Syst. Appl. 69(1), 159–175 (2017).
59. S. Chen, G.-H. Peng, X.-S. He, and X.-S. Yang, Global convergence analysis of the bat algorithm using a Markovian framework and dynamic system theory. Exp. Syst. Appl. 114(1), 173–182 (2018).
60. X.-S. Yang and S. Deb, Cuckoo search via Lévy flights. In Proceedings of the World Congress on Nature & Biologically Inspired Computing (NaBIC 2009) (IEEE Publications, USA, 2009), pp. 210–214.
61. N. B. Davies and M. L. Brooke, Co-evolution of the cuckoo and its hosts. Sci. Am. 264(1), 92–98 (1991).
62. N. B. Davies, Cuckoo adaptations: Trickery and tuning. J. Zool. 284(1), 1–14 (2011).
63. I. Pavlyukevich, Lévy flights, non-local search and simulated annealing. J. Comput. Phys. 226(2), 1830–1844 (2007).

64. R. N. Mantegna, Fast, accurate algorithm for numerical simulation of Lévy stable stochastic processes. Phys. Rev. E 49(5), 4677–4683 (1994).
65. X.-S. Yang and S. Deb, Multiobjective cuckoo search for design optimization. Comput. Oper. Res. 40(6), 1616–1624 (2013).
66. M. Shehab, A. T. Khader, and M. A. Al-Betar, A survey on applications and variants of the cuckoo search algorithm. Appl. Soft Comput. 61, 1041–1059 (2017).
67. X.-S. Yang, Flower pollination algorithm for global optimization. In J. Durand-Lose and N. Jonoska (eds.), Unconventional Computation and Natural Computation (UCNC 2012), vol. 7445 (Springer, Berlin/Heidelberg, Germany, 2012), pp. 240–249.
68. N. M. Waser, Flower constancy: Definition, cause and measurement. Am. Nat. 127(5), 596–603 (1986).
69. Z. A. A. Alyasseri, A. T. Khader, M. A. Al-Betar, M. A. Awadallah, and X.-S. Yang, Variants of the flower pollination algorithm: A review. In X.-S. Yang (ed.), Nature-Inspired Algorithms and Applied Optimization (Springer, Cham, 2018), pp. 91–118.
70. M. Abdel-Basset and L. A. Shawky, Flower pollination algorithm: A comprehensive review. Artif. Intell. Rev. 52(4), 2533–2557 (2019).
71. A. A. A. Mahmud, D. Satakshi, and W. Jeberson, Aircraft landing scheduling using embedded flower pollination algorithm. Int. J. Pure Appl. Math. 119(16), 1719–1735 (2018).
72. X.-S. Yang, M. Karamanoglu, and X. He, Flower pollination algorithm: A novel approach for multiobjective optimization. Eng. Optimiz. 46(9), 1222–1237 (2014).
73. J. D. Ser, E. Osaba, D. Molina, X.-S. Yang, S. Salcedo-Sanz, D. Camacho, S. Das, P. N. Suganthan, C. A. C. Coello, and F. Herrera, Bio-inspired computation: Where we stand and what's next. Swarm Evol. Comput. 48, 220–250 (2019).
74. D. H. Wolpert and W. G. Macready, No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1(1), 67–82 (1997).
75. D. H. Wolpert and W. G. Macready, Coevolutionary free lunches. IEEE Trans. Evol. Comput. 9(6), 721–735 (2005).
76. A. Auger and O. Teytaud, Continuous lunches are free plus the design of optimal optimization algorithms. Algorithmica 57(2), 121–146 (2010).
77. D. Corne and J. Knowles, Some multiobjective optimizers are better than others. Evol. Comput. 4(2), 2506–2512 (2003).
78. Y. LeCun, Y. Bengio, and G. E. Hinton, Deep learning. Nature 521(7553), 436–444 (2015).
79. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, Cambridge, MA, 2017).
80. X.-S. Yang and X.-S. He, Mathematical Foundations of Nature-Inspired Algorithms. Springer Briefs in Optimization (Springer, Cham, Switzerland, 2019).
81. A. E. Eiben and S. K. Smit, Parameter tuning for configuring and analyzing evolutionary algorithms. Swarm Evol. Comput. 1(1), 19–31 (2011).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0016

Chapter 16

Vector Computation

Karl Svozil

Institute for Theoretical Physics, TU Wien, Wiedner Hauptstrasse 8-10/136, 1040 Vienna, Austria
[email protected]

Quantum physical resources are directional quantities that can be formalized by unit vectors or the associated orthogonal projection operators. When compared to classical computational states, which are elements of (power) sets, vector computations offer (dis)advantages.

16.1. Epistemology versus Ontology in the Quantum Computation Context

In order to claim practical relevance, any notion and quantitative means of “computation” has to be ultimately grounded in physics, because information is physical,1 and so is the manipulation of information. The Church–Turing thesis — in Turing’s own words, a man provided with paper, pencil, and rubber, and subject to strict discipline, is in effect a universal machine2 — is a conjecture attempting to achieve just this goal: connecting physics to an appropriate formalism. Yet such conceptualizations bear, in their very success, a dangerous tendency to forget about the analogy and instead go for the formalism. Thereby nature is confused with formalism — indeed, with our own partly formalized narratives about nature — just as theater3 is taken for life, or propaganda4 for fact. This results in beliefs in theoretical and hypothetical conceptual entities, the existence of


various religious stigmas and social pressures, that taken together, amount[ed] to an evangelical crusade.5 Thereby, issues related to the formal representability of physical entities and processes are far from settled, and bordering on the mysterious.6 Yet, unlike Wigner, I believe that there may be at least two handy but mutually contradicting reasons for the effectiveness of mathematics in the natural sciences: one postulates that laws “emerge” from disorder.7–9 Another, apparently converse, reason for lawfulness is that we are inhabiting a virtual reality simulated by a computational process.10–14 In this hypothesis, whatever “laws of nature” science might discover should be perceived as epistemic, intrinsic reflections of, or correspondences to, this ontological, extrinsic (to us) computation.

The above rant served the purpose of suggesting, and preparing the reader for, a cautious disengagement between epistemology and ontology — what might be claimed and believed to be known on the one hand, and what may be “lurking behind the detector clicks” on the other hand. Even with this proviso, one has to keep in mind that there is no “Archimedean point” or “ontological anchor” upon which an “objective reality” (whatever that is) can be based. In particular, whenever claims are issued about quantum resources, assets, or features — such as quantum parallelism by coherent superposition of classically distinct states, or entanglement as the relational encoding of multi-partite states — which might go beyond operationally established classical means and could give rise to quantum advantages, caution is advisable. Because in such cases the metaphor might supervene or “outperform” the simulated, yielding improper expectations and overstated, almost evangelical, claims.15

16.2. Types of Quantum Oracles for Randomness: Pure States in a Superposition Versus Mixed States

There is a sombre fact of contemporary quantum physics: as of today the “measurement problem”, as contemplated by von Neumann,16


Schrödinger,17, 18 and repeated by Everett19, 20 and Wigner,21 despite numerous attempts to resolve it,22 remains disputed and unsolved. One of the issues is the simple mathematical impossibility to achieve irreversibility — the notorious “collapse of the wave function”, or state reduction at measurement point23 — from a completely reversible temporal evolution. Indeed, this should be indisputable from “group theory 101”: the concatenation of unitary operators never leads outside of the realm of unitary and thus reversible transformations. For finite groups, Cayley’s theorem states that one essentially is dealing with permutations, with one-to-one transformations, with re-expressions, re-samplings of the same “message”. Stated differently, a nesting argument16, 19–21 essentially “enlarges” the domain of reversibility to include whatever resources or regions of (Hilbert or configuration) space are necessary to re-establish reversibility, thereby disputing any irreversibility postulated by quantum mechanics.24–32 For all practical purposes33 quantum systems remain “epistemically” irreversible,34, 35 but so are classically reversible statistical systems, for which the entropy increase dissolves into thin air as one looks “closer” at individual constituents.36

Having just avoided the quantum Scylla of “irreversibility through reversibility” brings us closer to the quantum Charybdis of “quantum jellification” by the prevailing quantum superposition of classically distinct states of matter and mind, described so vividly in the late Schrödinger’s Dublin seminars,18 repeating the cat paradox17 in terms of jellyfish. Without measurement — how can we and everything around us “remain stable” and not dissolve into the chasm of coherent superposition, and how come that, despite this, we experience a fairly unique cognition and presence?

Dirac’s “why bother?” objection33 to all of this might be to not bother at the moment and leave the task of solving these issues to future generations. Feynman went a step further and demanded to cease thinking about them, thereby taking the formalism for granted (like a gospel) as given, and thus effectively shut-up and calculate.37–39 However, such a repression strategy has consequences: first of all, as mentioned earlier the deterministic, unitary evolution of


quantum states (through nesting) contradicts assumptions about irreversible measurements. Thereby the ontology of the alleged irreducible randomness of single events through measurements of a coherent superposition of classically distinct or even mutually exclusive states remains open. Under such circumstances no justification for their stochastic character, no “quantum certification”40 of, in theological terminology, creatio continua, can be given, because consistency issues are unresolved and currently dismissed, thereby either relegating a resolution of the argument to the future or not addressing the ontology at all. Quantum uncertainty and random outcomes via irreversible measurements of coherent superpositions therefore remain conjectural.

Another related ontological issue regards the existence of mixed states. First of all, the same issues as for measurements are pertinent: how does one obtain a mixed state from pure ones by one-to-one unitary state evolution? Claiming to be able to obtain mixed states from pure ones amounts to pretending to get along with outright mathematical incorrectness. One formal way of generating mixed states from pure multipartite states is by taking the partial trace with respect to the Hilbert space of one particle, a “beam dump” of sorts.34, 35 As the trace is essentially a many-to-one operation, irreversibility ensues. This can be easily corroborated by the non-uniqueness of purification, which can be envisioned as the “quasi-inverse” (because, strictly speaking, due to non-uniqueness there is no inverse) of the partial trace [41, Section 8.3.1].

So what is the ontological status of mixed quantum states? This again is unknown and may again, by the “why bother?” objection,33 be relegated until some later times. Again the question arises: how can we trust quantum random sequences originating from mixed state measurements? Even more convoluted is the question whether there is any criterion or difference between sequences generated by measurement of pure states which are in a coherent superposition on the one hand, and by measurement of mixed quantum states on the other hand. As we cannot even formally grasp consistently the notions of irreversible


measurement and production of mixed states, we ought to accept that we are at a total loss to conceptualize the differences or similarities between them. We are left with the hope that, as both issues appear to have the same group-theoretic roots, any solution of one will, by reduction, entail a solution of the other.

Another “fashionable” attempt to ascertain and thereby certify quantum randomness involving “quantum contextuality”42, 43 would be to assume the co-existence of complementary observables and, relative to the respective implicit assumptions (most notably context independence), prove (by contradiction or statistical demonstration) the impossibility of any value definiteness of complementary, incompatible observables prior to measurements.43–47 Suffice it to say that, as these hypothetical arguments go, they are contingent on the respective counterfactual configurations imagined, and thus appear subjective and inconclusive.48
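The many-to-one character of the partial trace mentioned above can be made concrete numerically. The following minimal numpy sketch, my own illustration rather than part of the original text, traces out one particle of two distinct pure two-qubit states and obtains the same maximally mixed reduced state for both, so the operation cannot be inverted:

```python
import numpy as np

def partial_trace_B(rho, dA, dB):
    # Trace out subsystem B of a density matrix acting on H_A (x) H_B.
    return np.trace(rho.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

# Two distinct pure two-qubit states: a Bell state and a phase-flipped variant.
bell_plus  = np.array([1, 0, 0,  1]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
bell_minus = np.array([1, 0, 0, -1]) / np.sqrt(2)   # (|00> - |11>)/sqrt(2)

for psi in (bell_plus, bell_minus):
    rho = np.outer(psi, psi.conj())                 # pure-state density matrix
    print(np.round(partial_trace_B(rho, 2, 2), 3))  # both yield 0.5 * identity
```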

16.3. Questionable Parallelism by Views on a Vector

From now on only pure states, represented by (unit) vectors spanning a 1D subspace of a vector space, will be considered. Schrödinger’s18 aforementioned question regarding quantum jellification, a variant of his earlier “cat paradox”,17 might be brought to practical use for quantum parallelization: for the sake of irritation suppose, as is often alleged in quantum computations, that the many mutually exclusive (classical) states in a coherent superposition be not alternatives but all really happen simultaneously . . . if the laws of nature took this form for, let me say, a quarter of an hour, we should find our surroundings rapidly turning into a quagmire, or sort of a featureless jelly or plasma, all contours becoming blurred, we ourselves probably becoming jelly fish.

One of the simplest answers that can be given to these concerns is that this is just an epistemic question arising from a “wrong” viewpoint or perspective. Because if one chooses an orthonormal basis of Hilbert space of which the state vector is an element, then the coherent superposition reduces to a single, unique term — namely, that unit vector. Unitary quantum evolution amounts to “rotating”


this vector in Hilbert space. There is no “jellification” of this state with respect to the “proper” bases (for dimensions higher than 2 there is a continuum of bases) of which the state is one part. In this view, there is no “simultaneous existence” of two or more classically distinct states. All other probabilistic views are mere (continuity of) multiplicities, projections of sorts. For a similar, more formalized, vision see Gleason’s remarks on taking the square of the norm of the projection in the second paragraph of Ref. [49]. Indeed, one can find some hints on this solution in Schrödinger’s own writings on the Vedantic vision [50, Chapter V]: the plurality that we perceive is only an appearance; it is not real. Vedantic philosophy, in which this is a fundamental dogma, has sought to clarify it by a number of analogies, one of the most attractive being the many-faceted crystal which, while showing hundreds of little pictures of what is in reality a single existent object, does not really multiply that object.

This might be considered bad news for quantum parallelism. Because if any such parallelism is based upon epistemology — about appearance without any substantial reality — all that remains as a resource is our ability to measure properties of the vector from a “different angle” than the one in which this vector has been defined. This is particularly pertinent for the “extraction” of information in a coherent superposition by classical irreversible measurements (cf. my earlier comments in Section 16.2), “reducing” a coherent superposition of an exponential variety of hypothetically conceivable classically distinct states to a single such state. Because what good is it to contemplate such a counterfactual variety if we have no direct access to it? It gets even more problematic as we have postulated that the “extraction”, the outcome corresponding to this single state, occurs without sufficient reason51 and thereby eventualizes in an irreducibly stochastic40 manner, a situation denoted in theology by creatio continua: one can get only a single click or outcome per experiment, corresponding to an “exponential reduction” of computational states with respect to the state vector representation prior to extraction.


16.4. Computation by Projective Measurements of Partitioned State Space

Nevertheless, even in this reduced scheme of parallelism, it might be possible to formulate relational queries corresponding to useful information by appropriate partitioning of the state space.52–54 We may formulate the fundamental problem of intrinsically operational vector encoding of a computation aka quantum computation: under what circumstances is it possible to derive “useful” information about (the components of) a vector? It is not too unreasonable to suspect that an answer to this question can be given in terms of relational properties of the vector encoding.55, 56 Deutsch’s algorithm [41, Section 1.4.3] as well as the quantum Fourier transform based on period finding [41, Section 5.4.1] may be examples of such relational encodings realizable by state mismatches (“prepare one state, measure another”). More generally, the partitioning of finite groups by cosets, in particular the hidden subgroup problem [41, Section 5.4.3], may be a way to systematically exploit views on vectors, but so far there exists only anecdotal evidence [41, Figure 5.5, p. 241] for that.

There is a way to extract information from a quantum state by constructing proper (with respect to the computational task or query) subspaces and the orthogonal projection operators onto such subspaces. How ought this computational method to be understood? It will be argued that it could be perceived in terms of (equi)partitioning the quantum state space and, as mentioned earlier, by using the respective projection operators as filters.52–54

Suppose it is possible to encode (the solution to) a problem into a Hilbert space spanned by a collection of pure state vectors, encoding functional instances and functional properties into one- and higher-dimensional subspaces thereof. For instance, a binary function of several bits can be encoded into quantum states (i) by state vectors whose first components with respect to an orthonormal basis (normalization aside) are either 0 or 1, depending on whether or not the function on the inputs evaluates to 0 or 1, and (ii) later


“auxiliary” components are added to ensure mutual orthogonality of the state vectors. That is, in order to obtain an orthonormal basis one could, for instance, employ dimensional lifting, and thereby enlarge the Hilbert space. This results in an orthonormal (after normalization) basis of the aforementioned Hilbert space. Therefore the elements of the orthonormal basis represent the individual instances of the function or problem.

Now if one forms the orthogonal projection operator as the sum of the dyadic products of the respective vectors of the orthonormal basis of the subspace encoding a particular problem or a query, then this projection operator is capable of solving the property or problem it was encoded to solve in a single run. All that needs to be done is to apply this projection “filter” to a state encoding some arbitrary problem instance.

Let me demonstrate this by an example which is a generalized Deutsch algorithm. Consider arbitrary binary functions of $n$ classical bits. Suppose an unknown arbitrary such function is given, and suppose that the question is not which function exactly it is, but about a relational property which, for instance, refers to “common” or “different” properties of functions of this class; say, parity. How could one find such a particular property without having to identify the respective function completely? It is not too difficult to argue that there are $2^n$ possible arguments and $2^{2^n}$ such binary functions of $n$ bits. For the sake of a reasonably “small” demonstration, take $n = 2$ ($n = 1$ amounts to Deutsch’s problem; cf. Refs. [23, Section 2.2; 41, Section 1.4.3]). Table 16.1 enumerates all binary functions of two classical bits. (At this point we are not dealing with questions of enlarging the Hilbert space to obtain overall reversibility in case the functions are not reversible.)

In the next step, a system of vectors $|e_i\rangle$ is obtained by identifying the valuations of the functions on the respective bit values with entries in coordinate tuples, as enumerated in the second to last column of Table 16.1.

Table 16.1. The 16 binary functions of two classical bits. Each row lists the function values on the inputs (00, 01, 10, 11), the corresponding vector, and the vector after dimensional lifting (20 dimensions).

f1:  0 0 0 0,  |e1⟩ = (0, 0, 0, 0),  |b1⟩ = (0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f2:  0 0 0 1,  |e2⟩ = (0, 0, 0, 1),  |b2⟩ = (0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f3:  0 0 1 0,  |e3⟩ = (0, 0, 1, 0),  |b3⟩ = (0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f4:  0 0 1 1,  |e4⟩ = (0, 0, 1, 1),  |b4⟩ = (0, 0, 1, 1, 0, −1, −1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f5:  0 1 0 0,  |e5⟩ = (0, 1, 0, 0),  |b5⟩ = (0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f6:  0 1 0 1,  |e6⟩ = (0, 1, 0, 1),  |b6⟩ = (0, 1, 0, 1, 0, −1, 0, −2, −1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f7:  0 1 1 0,  |e7⟩ = (0, 1, 1, 0),  |b7⟩ = (0, 1, 1, 0, 0, 0, −1, −2, −1, −6, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f8:  0 1 1 1,  |e8⟩ = (0, 1, 1, 1),  |b8⟩ = (0, 1, 1, 1, 0, −1, −1, −4, −1, −12, −84, 1, 0, 0, 0, 0, 0, 0, 0, 0)
f9:  1 0 0 0,  |e9⟩ = (1, 0, 0, 0),  |b9⟩ = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0)
f10: 1 0 0 1,  |e10⟩ = (1, 0, 0, 1),  |b10⟩ = (1, 0, 0, 1, 0, −1, 0, −2, 0, −6, −40, −3442, −1, 1, 0, 0, 0, 0, 0, 0)
f11: 1 0 1 0,  |e11⟩ = (1, 0, 1, 0),  |b11⟩ = (1, 0, 1, 0, 0, 0, −1, −2, 0, −4, −30, −2578, −1, −8874706, 1, 0, 0, 0, 0, 0)
f12: 1 0 1 1,  |e12⟩ = (1, 0, 1, 1),  |b12⟩ = (1, 0, 1, 1, 0, −1, −1, −4, 0, −10, −70, −6020, −1, −20723712, ·, 1, 0, 0, 0, 0)
f13: 1 1 0 0,  |e13⟩ = (1, 1, 0, 0),  |b13⟩ = (1, 1, 0, 0, 0, 0, 0, 0, −1, −2, −14, −1202, −1, −4137858, ·, ·, 1, 0, 0, 0)
f14: 1 1 0 1,  |e14⟩ = (1, 1, 0, 1),  |b14⟩ = (1, 1, 0, 1, 0, −1, 0, −2, −1, −8, −54, −4644, −1, −15986864, ·, ·, ·, 1, 0, 0)
f15: 1 1 1 0,  |e15⟩ = (1, 1, 1, 0),  |b15⟩ = (1, 1, 1, 0, 0, 0, −1, −2, −1, −6, −44, −3780, −1, −13012562, ·, ·, ·, ·, 1, 0)
f16: 1 1 1 1,  |e16⟩ = (1, 1, 1, 1),  |b16⟩ = (1, 1, 1, 1, 0, −1, −1, −4, −1, −12, −84, −7222, −1, −24861568, ·, ·, ·, ·, ·, 1)

Note: Dots as vector components represent “very large” numbers.


Based on these vectors an orthonormal basis of a (subspace of a) high-dimensional Hilbert space can be effectively generated by dimensional lifting,57 so that $|e_i\rangle \to |b_i\rangle$ with $\langle b_i|b_j\rangle = \delta_{ij}$ and $i, j = 1, \ldots, 16$, as enumerated (without normalization) in the last column of Table 16.1. Thereby the zero vector $|e_1\rangle$ can, for instance, be ad hoc mapped into a subspace of this larger dimensional Hilbert space which is orthogonal to all other subspaces (this may be achieved by adding another dimension). Note that:

(i) In general, dimensional lifting is not unique — there exist other, rather inefficient methods58 (with respect to the number of auxiliary extra dimensions) to orthogonalize the vectors corresponding to the functions $f_i$.

(ii) Dimensional lifting does not correspond to a unitary transformation, as it intentionally changes the inner products (to become zero) in transit to higher dimensions. Therefore, if one attempts to encode this kind of problem into orthogonal bases of subspaces of higher dimensional Hilbert spaces, one needs to take care of orthogonality from the very beginning. That is, there has to be a physically feasible way to map the functions $f_i$ into $|b_i\rangle$.

(iii) Accordingly, any way to map the 16 functions $f_i$ into any kind of system of orthogonal vectors suffices for this method, as long as it is physically feasible.

In the final step a filter is designed which models the binary question by projecting the answers onto the appropriate subspace of the (sub)space spanned by the orthonormal basis $\{|b_1\rangle, \ldots, |b_{16}\rangle\}$ such that the question can be answered in a single query. Suppose the question is to find the parity of a function $f_i(x, y) \in \{0, 1\}$ with $x, y \in \{0, 1\}$, $i \in \{1, \ldots, 16\}$. All we need to do is to partition the functional space $\{f_1, \ldots, f_{16}\}$ into functions with an even or an odd number of outputs “1”. More explicitly, for the functions enumerated in Table 16.1 and for parity, the partition is
$$\big\{\{f_2, f_3, f_5, f_8, f_9, f_{12}, f_{14}, f_{15}\}, \{f_1, f_4, f_6, f_7, f_{10}, f_{11}, f_{13}, f_{16}\}\big\}, \tag{16.1}$$


corresponding to the orthogonal projection operators
$$E_1 = |b_2\rangle\langle b_2| + |b_3\rangle\langle b_3| + |b_5\rangle\langle b_5| + |b_8\rangle\langle b_8| + |b_9\rangle\langle b_9| + |b_{12}\rangle\langle b_{12}| + |b_{14}\rangle\langle b_{14}| + |b_{15}\rangle\langle b_{15}|,$$
and
$$E_0 = \mathbb{1} - E_1 = |b_1\rangle\langle b_1| + |b_4\rangle\langle b_4| + |b_6\rangle\langle b_6| + |b_7\rangle\langle b_7| + |b_{10}\rangle\langle b_{10}| + |b_{11}\rangle\langle b_{11}| + |b_{13}\rangle\langle b_{13}| + |b_{16}\rangle\langle b_{16}|. \tag{16.2}$$
The parity of an unknown given binary function of two bits can be obtained by a single query measuring the propositional “parity property” associated with the observable $E_1 = \mathbb{1} - E_0$. In principle this method can be generalized to the parity problem of binary functions of $n$ bits, utilizing a parallelization of the order of $2^{2^n}$ at the cost of expanding the Hilbert space to about twice this number of dimensions. It remains to be seen whether this method violates the assumptions in Ref. [59]. In any case it should be noted that parity is an example of a much wider problem class associated with relational properties which can be represented or parametrized by partitioning appropriate subspaces of Hilbert space.
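As an illustration of the dimensional-lifting construction and the single-query parity filter, the following numpy sketch is my own addition; it uses a plain Gram–Schmidt lifting rather than the particular integer-scaled vectors of Table 16.1, which suffices since, per note (iii) above, any physically feasible orthogonalization will do:

```python
import numpy as np
from itertools import product

# Truth tables of the 16 binary functions of two bits, ordered as in Table 16.1:
# row i holds (f_i(0,0), f_i(0,1), f_i(1,0), f_i(1,1)).
e = np.array(list(product([0, 1], repeat=4)), dtype=float)   # 16 x 4

# Dimensional lifting: append one auxiliary coordinate per function so the 16
# vectors become linearly independent in 20 dimensions, then orthonormalize
# them by Gram-Schmidt.
v = np.hstack([e, np.eye(16)])
b = []
for vi in v:
    for bj in b:
        vi = vi - (vi @ bj) * bj          # strip components along earlier b's
    b.append(vi / np.linalg.norm(vi))
b = np.array(b)                           # rows satisfy <b_i|b_j> = delta_ij

# Parity filter E1: projector onto the span of those b's whose truth table has
# an odd number of 1s, i.e. {f2, f3, f5, f8, f9, f12, f14, f15} of Eq. (16.1).
odd = [i for i in range(16) if int(e[i].sum()) % 2 == 1]
E1 = sum(np.outer(b[i], b[i]) for i in odd)

# A single "query": <b_i| E1 |b_i> equals 1 for odd and 0 for even parity.
for i in range(16):
    assert np.isclose(b[i] @ E1 @ b[i], 1.0 if i in odd else 0.0)
print("all 16 parities recovered by a single projective query")
```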


16.5. Entanglement as Relational Parallelism Across Multi-Partite States

From a purely formal point of view, entangled particles are modeled by the indecomposability of state vectors in a Hilbert space which is a non-trivial tensor product of two or more Hilbert spaces. Indecomposability means that the respective state vector cannot be decomposed into a single product of factors of the states of the constituent particles. Instead, an entangled state can be written as a coherent superposition (aka linear combination) of such product states. This immediately suggests that whenever such indecomposable vectors occur they can be “rotated” by unitary transformations into a single product form; say, a vector of the Cartesian standard basis in the respective tensor product of two or more Hilbert spaces. Any such transformation cannot be expected to act “locally” in a single constituent space, but rather “globally” across the single constituent spaces. Physically this means that we are not dealing with single-particle properties but again with relational properties, whereby relational information is encoded across the multiple constituents of such a state.17, 60 This can be expected: as unitary transformations are defined by rotations transforming some orthonormal basis into another orthonormal basis,61 this amounts to a kind of “zero-sum game” between localized information aka properties on individual constituents on the one hand, and relational information which for instance refers to “common” or “different” properties within collectives or groups of constituents on the other hand.
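Indecomposability can be checked mechanically: reshaping a bipartite state vector into a coefficient matrix and counting its non-zero singular values yields the Schmidt rank, which exceeds 1 exactly for entangled states. The following numpy sketch is my own illustration, not part of the original text:

```python
import numpy as np

def schmidt_rank(psi, dA, dB, tol=1e-12):
    # Reshape the bipartite state into a dA x dB coefficient matrix; the
    # number of non-negligible singular values is the Schmidt rank.
    s = np.linalg.svd(psi.reshape(dA, dB), compute_uv=False)
    return int(np.sum(s > tol))

product_state = np.kron([1, 0], [1, 1]) / np.sqrt(2)   # |0> (x) |+>
bell_state    = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2)

print(schmidt_rank(product_state, 2, 2))   # 1: decomposable into a product
print(schmidt_rank(bell_state, 2, 2))      # 2: indecomposable, i.e. entangled
```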


16.6. On Partial Views of Vectors

In the context of state purification the following more general question arises: what kind of value might a partial knowledge of or about a vector have? After all, embedded observers11 may obtain only partial knowledge and control of the degrees of freedom entailed by overseeing a “small” subset of a “much larger” Hilbert space they inhabit. Suppose an observer has acquired knowledge about an incomplete list of components (relative to a particular basis) of a pure state vector. This can be formalized either by a projection of this vector onto a subspace of the Hilbert space, or by “extraction” of the coordinates by the respective vectors of the dual basis of the dual space. One possibility would be to “complete” the state vector by various procedures, such as the aforementioned purification. Alternatively, one might consider the subspace spanned by both the given and the missing basis elements.62

16.7. Summary

I have presented a revisionist glance at quantum computation as seen from an equally revisionist perspective on quantum theory. The main télos, that is, the end, purpose, or goal, of these considerations rests in the emphasis that only with a proper understanding of the quantum physical resources is it possible to develop a comprehensive theory of quantum information and computation. For instance, the mere communal or individual canonical belief in ontological quantum randomness — without mentioning the implicit assumptions or corroborations, which are essentially based upon incapacity, that is, the experience that nobody so far has come up with any causes and necessary and sufficient reasons for quantum outcomes — suggests that any such claims need be viewed as epistemic, anecdotal and preliminary. I believe that quantum information and computation are intimately tied to foundational issues. Therefore, such issues need at least to be kept in mind if one assesses the capacities of quantum systems to store and process information.

Acknowledgments

Parts of this discussion have been inspired by conversations with Noson Yanofsky about whether quantum superpositions and entanglements make the universe unboring. This research was funded in whole, or in part, by the Austrian Science Fund (FWF), Project No. I 4579-N.

References

1. R. Landauer, Physics Today 44(5), 23 (1991), https://doi.org/10.1063/1.881299.
2. A. M. Turing, In B. J. Copeland (ed.), The Essential Turing (Oxford University Press, Oxford and New York, 2004), https://global.oup.com/academic/product/the-essential-turing-9780198250807.
3. A. Artaud, The Theatre and Its Double (Alma Classics Limited, Richmond Surrey, UK, 2010 (1938, 1964)), translated by Victor Corti, https://almabooks.com/product/the-theatre-and-its-double/.
4. E. Bernays, Propaganda (Routledge, 1928), http://self.gutenberg.org/Get956uFile.aspx?&bookid=100002651.
5. J. F. Clauser, In R. Bertlmann and A. Zeilinger (eds.), Quantum (Un)speakables: From Bell to Quantum Information (Springer, Berlin, 2002), pp. 61–96, ISBN 978-3-662-05032-3, 978-3-540-42756-8, https://doi.org/10.1007/978-3-662-05032-3_6.


6. E. P. Wigner, Commun. Pure Appl. Math. 13, 1 (1960), https://doi.org/10.1002/cpa.3160130102.
7. F. S. Exner, Über Gesetze in Naturwissenschaft und Humanistik: Inaugurationsrede gehalten am 15. Oktober 1908 (Hölder, Ebooks on Demand Universitätsbibliothek Wien, Vienna, 1909, 2016), handle https://hdl.handle.net/11353/10.451413, o:451413, uploaded 30.08.2016, http://phaidra.univie.ac.at/o:451413.
8. N. S. Yanofsky, Finding order in chaos (2017), preprint, received March 20th, 2017, http://www.sci.brooklyn.cuny.edu/~noson/FindingOrder.pdf.
9. C. S. Calude and K. Svozil, Philosophies 4(2), 17 (2019), arXiv:1812.04416, https://doi.org/10.3390/philosophies4020017.
10. K. Zuse, Calculating Space. MIT Technical Translation AZT-70-164-GEMIT (MIT (Proj. MAC), Cambridge, MA, 1970).
11. T. Toffoli, In G. J. Klir (ed.), Applied General Systems Research: Recent Developments and Trends (Plenum Press, Springer US, New York, London, and Boston, MA, 1978), pp. 395–400, ISBN 978-1-4757-0555-3, https://doi.org/10.1007/978-1-4757-0555-3_29.
12. D. F. Galouye, Simulacron 3 (Bantam Books, New York, 1964).
13. G. Egan, Permutation City (1994), ISBN 006105481X, 9780061054815, accessed on January 4, 2017, http://www.gregegan.net/PERMUTATION/Permutation.html.
14. N. Bostrom, The Philos. Quart. 53(211), 243 (2003), https://www.simulation-argument.com/simulation.pdf, https://doi.org/10.1111/1467-9213.00309.
15. K. Svozil, Ethics Sci. Environ. Polit. (ESEP) 16(1), 25 (2016), arXiv:1605.08569, https://doi.org/10.3354/esep00171.
16. J. von Neumann, Mathematische Grundlagen der Quantenmechanik, 2nd edn. (Springer, Berlin, Heidelberg, 1932, 1996), ISBN 978-3-642-61409-5, 978-3-540-59207-5, 978-3-642-64828-1, https://doi.org/10.1007/978-3-642-61409-5.
17. E. Schrödinger, Naturwissenschaften 23, 807 (1935), https://doi.org/10.1007/BF01491891, https://doi.org/10.1007/BF01491914, https://doi.org/10.1007/BF01491987.
18. E. Schrödinger, The Interpretation of Quantum Mechanics. Dublin Seminars (1949–1955) and Other Unpublished Essays (Ox Bow Press, Woodbridge, Connecticut, 1995).
19. H. Everett III, Rev. Mod. Phys. 29, 454 (1957), https://doi.org/10.1103/RevModPhys.29.454.
20. H. Everett III, In J. A. Barrett and P. Byrne (eds.), The Everett Interpretation of Quantum Mechanics: Collected Works 1955–1980 with Commentary (Princeton University Press, Princeton, NJ, 1956, 2012), pp. 72–172, ISBN 9780691145075, http://press.princeton.edu/titles/9770.html.
21. E. P. Wigner, In I. J. Good (ed.), The Scientist Speculates (Heinemann, Basic Books, and Springer-Verlag, London, New York, and Berlin, 1961, 1962, 1995), pp. 284–302, https://doi.org/10.1007/978-3-642-78374-6_20.
22. F. London and E. Bauer, In Quantum Theory and Measurement (Princeton University Press, Princeton, NJ, 1983), pp. 217–259.


23. D. N. Mermin, Quantum Computer Science (Cambridge University Press, Cambridge, 2007), ISBN 9780521876582, https://doi.org/10.1017/CBO9780511813870.
24. A. Peres, Phys. Rev. D 22(4), 879 (August 1980), https://doi.org/10.1103/PhysRevD.22.879.
25. M. O. Scully and K. Drühl, Phys. Rev. A 25(4), 2208 (April 1982), https://doi.org/10.1103/PhysRevA.25.2208.
26. D. M. Greenberger and A. YaSin, Found. Phys. 19(6), 679 (1989), https://doi.org/10.1007/BF00731905.
27. M. O. Scully, B.-G. Englert, and H. Walther, Nature 351, 111 (May 1991), https://doi.org/10.1038/351111a0.
28. A. G. Zajonc, L. J. Wang, X. Y. Zou, and L. Mandel, Nature 353, 507 (October 1991), https://doi.org/10.1038/353507b0.
29. P. G. Kwiat, A. M. Steinberg, and R. Y. Chiao, Phys. Rev. A 45(11), 7729 (June 1992), https://doi.org/10.1103/PhysRevA.45.7729.
30. T. Pfau, S. Spälter, C. Kurtsiefer, C. R. Ekstrom, and J. Mlynek, Phys. Rev. Lett. 73(9), 1223 (August 1994), https://doi.org/10.1103/PhysRevLett.73.1223.
31. M. S. Chapman, T. D. Hammond, A. Lenef, J. Schmiedmayer, R. A. Rubenstein, E. Smith, and D. E. Pritchard, Phys. Rev. Lett. 75(21), 3783 (November 1995), https://doi.org/10.1103/PhysRevLett.75.3783.
32. T. J. Herzog, P. G. Kwiat, H. Weinfurter, and A. Zeilinger, Phys. Rev. Lett. 75(17), 3034 (1995), https://doi.org/10.1103/PhysRevLett.75.3034.
33. J. S. Bell, Phys. World 3, 33 (1990), https://doi.org/10.1088/2058-7058/3/8/26.
34. B.-G. Englert, J. Schwinger, and M. O. Scully, Found. Phys. 18(10), 1045 (1988), https://doi.org/10.1007/BF01909939.
35. J. Schwinger, M. O. Scully, and B.-G. Englert, Zeitschrift für Physik D: Atoms, Molecules and Clusters 10(2–3), 135 (1988), https://doi.org/10.1007/BF01384847.
36. W. C. Myrvold, Stud. History Philos. Sci. Part B: Stud. History Philos. Mod. Phys. 42(4), 237 (2011), https://doi.org/10.1016/j.shpsb.2011.07.001.
37. R. P. Feynman, The Character of Physical Law (MIT Press, Cambridge, MA, 1965), ISBN 9780262060165, 9780262560030, https://mitpress.mit.edu/books/character-physical-law.
38. D. N. Mermin, Phys. Today 42, 9 (1989), https://doi.org/10.1063/1.2810963.
39. D. N. Mermin, Phys. Today 57, 10 (2004), https://doi.org/10.1063/1.1768652.
40. A. Zeilinger, Nature 438, 743 (2005), https://doi.org/10.1038/438743a.
41. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2010), 10th Anniversary Edition, https://doi.org/10.1017/CBO9780511976667.
42. K. Svozil, Phys. Rev. A 79(5), 054306 (2009), arXiv:quant-ph/0903.2744, https://doi.org/10.1103/PhysRevA.79.054306.
43. K. Svozil, Entropy 22(6), 602 (May 2020), arXiv:1707.08915, https://doi.org/10.3390/e22060602.


44. S. Kochen and E. P. Specker, J. Math. Mech. (now Indiana Univ. Math. J.) 17(1), 59 (1967), https://doi.org/10.1512/iumj.1968.17.17004.
45. M. Froissart, Il Nuovo Cimento B (1971–1996) 64, 241 (1981), https://doi.org/10.1007/BF02903286.
46. I. Pitowsky, J. Math. Phys. 39(1), 218 (1998), https://doi.org/10.1063/1.532334.
47. A. A. Abbott, C. S. Calude, and K. Svozil, J. Math. Phys. 56(10), 102201 (2015), arXiv:1503.01985, https://doi.org/10.1063/1.4931658.
48. K. Svozil, Quantum Reports 2(2), 278 (2020), arXiv:1808.00813, https://doi.org/10.3390/quantum2020018.
49. A. M. Gleason, J. Math. Mech. (now Indiana Univ. Math. J.) 6(4), 885 (1957), https://doi.org/10.1512/iumj.1957.6.56050.
50. E. Schrödinger, My View of the World (Cambridge University Press, Cambridge, UK, 1951), ISBN 9780521062244, 9781107049710, https://doi.org/10.1017/CBO9781107049710.
51. Y. Y. Melamed and M. Lin, In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Metaphysics Research Lab, Stanford University, 2020), spring 2020 edn., https://plato.stanford.edu/archives/spr2018/entries/sufficient-reason/.
52. N. Donath and K. Svozil, Phys. Rev. A 65, 044302 (2002), arXiv:quant-ph/0105046, https://doi.org/10.1103/PhysRevA.65.044302.
53. K. Svozil, Phys. Rev. A 66, 044306 (2002), arXiv:quant-ph/0205031, https://doi.org/10.1103/PhysRevA.66.044306.
54. K. Svozil, J. Mod. Opt. 51, 811 (2004), arXiv:quant-ph/0308110, https://doi.org/10.1080/09500340410001664179.
55. K. Svozil and J. Tkadlec, Nat. Comput. (2009), https://doi.org/10.1007/s11047-009-9112-5.
56. K. Svozil, In L. Grandinetti, S. L. Mirtaheri, and R. Shahbazian (eds.), High-Performance Computing and Big Data Analysis (Springer International Publishing, Cham, 2019), vol. 891 of Communications in Computer and Information Science, pp. 504–512, ISBN 978-3-030-33494-9, 978-3-030-33495-6, arXiv:1904.08307, https://doi.org/10.1007/978-3-030-33495-6_39.
57. H. Havlicek and K. Svozil, Entropy 20(4), 284 (2018), arXiv:1606.03873, https://doi.org/10.3390/e20040284.
58. K. Svozil, Entropy 18(5), 156 (2016), arXiv:1601.07106, https://doi.org/10.3390/e18050156.
59. E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, Phys. Rev. Lett. 81, 5442 (1998), arXiv:quant-ph/9802045, https://doi.org/10.1103/PhysRevLett.81.5442.
60. A. Zeilinger, Found. Phys. 29(4), 631 (1999), https://doi.org/10.1023/A:1018820410908.
61. J. Schwinger, Proc. Nat. Acad. Sci. (PNAS) 46, 570 (1960), https://doi.org/10.1073/pnas.46.4.570.
62. (2020), arXiv:2010.09506, http://arxiv.org/abs/2010.09506.


63. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, NJ, 1955), ISBN 9780691028934, http://press.princeton.edu/titles/2113.html.
64. F. London and E. Bauer, La théorie de l'observation en mécanique quantique; No. 775 of Actualités scientifiques et industrielles: Exposés de physique générale, publiés sous la direction de Paul Langevin (Hermann, Paris, 1939).


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235726_0017

Chapter 17

Unsupervised Learning Approach Using Reinforcement Techniques on Bio-inspired Topologies

Karolos-Alexandros Tsakalos∗,‡, Georgios Ch. Sirakoulis∗,§ and Andrew Adamatzky†,¶

∗ Laboratory of Electronics, Department of Electrical and Computer Engineering (DECE), Democritus University of Thrace (DUTH), Panepistimioupoli DUTH Xanthi-Kimmeria, Xanthi, GR 67100, Greece
† Unconventional Computing Laboratory, FET — Computer Science and Creative Technologies, University of the West of England (UWE), Bristol, UK
‡ [email protected]
§ [email protected]
¶ [email protected]

Modeling complex bio-inspired networks is widely used in the research field of emerging computing, which promises rapid growth in the field of computer science. This work deals with bio-inspired molecular networks which have been studied through neuromorphic computing. This molecular-based structure is adapted to create a complex recurrent neuromorphic network that consists of neurons integrated with the simple Izhikevich neuromorphic model. Therefore, molecular atoms are considered as neurons and chemical edges as synapses. More specifically, the molecular-based structure of the Verotoxin-1 molecule has been extensively studied. Two reinforcement excitation techniques inspired from Cellular Automata studies, namely, the Game-of-Life (GoL) rule and the Majority rule, are employed to control the stimulation of each neuron depending on its neighbourhood activity. In this work, two different


CA-inspired unsupervised learning methods, along with neuro-inspired Hebbian learning, have also utilized the local activity to apply self-organization and update the recurrent synaptic weights, highlighting complex neuromorphic clusters that are integrated into the existing molecular structure. Finally, by applying the proposed reinforcement excitation techniques along with the unsupervised learning, we investigate the potential of spatio-temporal signal classification through the proposed framework based on the molecular structure. The obtained results showed us this framework's ability to distinguish high-dimensional signals; in this sense, we further discuss how these learning approaches, along with molecular-based structures, can be utilized to learn and help us with different complex tasks in a wide range of applications such as the classification of multi-dimensional signals.

17.1. Introduction

17.1.1. Molecular networks

The concept of networks is widely used to analyze and predict dynamics in a variety of complex systems.1, 2 When referring to networks, complex systems are decomposed into a set of interacting elements (nodes, vertices) which are interconnected through links (contacts, edges, interactions). Typically, these networks are represented by graphs in which links represent interactions between elements. Connections usually have some weight, which characterizes their strength (correlation, intensity or probability). Networks contain certain nodes called hubs, which are nodes that present a high degree of connectivity, that is, they are connected to multiple nodes. As an exception to most self-organized networks, the distribution of molecular networks can be modified to be more Poisson-like.3 The Poisson distribution means that molecular ensembles have fewer hubs than most self-organized networks, including most cellular or social networks. The primary reason for this deviation from scale-free distribution lies in the limited ability of amino acids to simultaneously bind to different side chains of other amino acids (also called an excitatory effect).3, 4

In molecular-based networks, nodes could represent parts of the molecule, while their connections are created taking into account the Euclidean distance between these parts.5 Molecular atoms can be


considered as nodes of the network, since molecular cell elements are typically represented by amino acid chains.6–8 In unweighted molecular networks, a Euclidean cleavage distance (usually in the order of some Å) is introduced, and only those amino acid atoms are interconnected, with unweighted edges, which are closer together than the limit determined by the Euclidean cut-off distance.9 These networks are usually called amino acid networks or protein structure networks, to distinguish them from the so-called “protein networks”, a widely used term for protein interaction networks. Thus, in the rest of this chapter, we use the term protein structure networks to describe networks that use protein topology. The protein structure networks were first used as a form of data mining to help compare protein topologies and identify their structural similarities.

17.1.2. Cellular automata

In the 1940s, John von Neumann and Stanislaw Ulam, taking inspiration from nature, proposed Cellular Automata (CAs) as a formal computing model which is capable of reproducing complex dynamics out of the collective behaviour of locally-arranged computing processors, called CA cells.10 The next fundamental breakthrough in the history of the development of the homogeneous structure of CAs is due to Stephen Wolfram,11 who suggested a simplification of the cell structure with locally-arranged connections in one dimension. This way of processing information appears more closely related to natural systems than conventional von Neumann architectures.12 Previous studies proved that CAs are very efficient in modeling physical systems and addressing scientific problems, because they can capture the critical peculiarities of natural systems, where collective behavior emerges from the combined effect of simple, locally interacting components.13–16 In contrast to conventional computing systems, the realization of CAs exploits parallelism, an inherent attribute of CAs that further accelerates the modeling procedure. In CAs, the CA cell state is realized as memory and the CA local rules perform the processing. These inherent attributes of CAs are intrinsically related to the CA cell.17
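As a minimal concrete illustration of these notions (my own sketch, not from the chapter), the following Python snippet evolves a one-dimensional Wolfram-style CA: the state array acts as memory, and the rule lookup table performs the local processing on each cell's (left, self, right) neighborhood:

```python
import numpy as np

def step(cells, rule=30):
    # Wolfram-style elementary CA: the next state of each cell depends only on
    # its local (left, self, right) neighborhood; boundaries wrap around.
    left, right = np.roll(cells, 1), np.roll(cells, -1)
    idx = 4 * left + 2 * cells + right        # neighborhood as a 3-bit index
    table = (rule >> np.arange(8)) & 1        # rule number -> lookup table
    return table[idx]

cells = np.zeros(31, dtype=int)
cells[15] = 1                                 # single seed cell
for _ in range(8):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```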


Against this background, being a well-studied computing framework, CAs provide computing efficacy and robustness through the global inter-working of simple components. CAs have also been extensively studied as a hardware architecture, and emergent phenomena can be simply simulated in a widespread range of computer science applications such as transportation problems,18 physical-based systems simulations,19 crowd evacuation management,20–23 Physarum polycephalum behavior modeling,24–30 large scale environmental-based modeling,31–34 various computationally hard problems from the computer science field, like the well-known shortest path problem35–38 or the NP-hard travelling salesman problem,39 and simulating reaction diffusion processes of Belousov–Zhabotinsky reactions.34, 40 CAs are distinguished by massive parallelism, local interactions and inherent emergent computing derived by self-organization. Furthermore, the design of CA-based hardware constitutes a reduced-complexity task, since the CA architecture offers several advantageous features: simplicity, since just a single CA cell needs to be designed and the rest of the system layout follows from the regular connectivity of the CA cells; the simplicity of CMOS-based fabrication; and efficient overall silicon-area utilization owing to the locality of interconnections.21, 24 Moreover, CA-based hardware utilizes the natural parallelism of CAs for the fast execution of operations, meeting the demanding computational tasks of modern computing.26, 34, 41

17.1.3. Conway's Game-of-Life

Although CAs were conceived in the 1940s, their growing reputation in the academic community can be traced to the 1970s, and in particular to the invention of the Game-of-Life (GoL) CA by Conway,42 who envisaged a way of emulating an infinite universe divided into CA cells, and later to Stephen Wolfram's contribution to the development of CAs, as he studied several 1D CA models and proved CAs' significance to computing, statistical mechanics and cryptography.43, 44 After this demonstration, the potential applications of the GoL CA have been further investigated in a broad range of scientific fields, such as biology, chemistry, image processing45 and


computing,46 as well as demonstrations of a full universal Turing machine that is capable of performing universal computations.47, 48

The GoL CA is a Boolean CA on a 2D regular grid; its state transition rules depend on the states of cells in the Moore neighborhood. Specifically, the transition rule is an outer totalistic rule, since it depends individually on the previous state of the central cell and on the states of cells in the Moore neighborhood. Conway's purpose was to study the simplest available configuration for a general purpose computer, and the resulting emergent computation, at a time when computing algorithms were at a premature stage regarding complexity and computational resource needs. Briefly, according to Conway's GoL CA, the state of the CA cells is binary and corresponds to either “dead” or “alive”. A living cell that has fewer than two or more than three alive neighboring cells “dies” of under-population or over-population, accordingly. For the remaining cases, it stays “alive”, while a “dead” cell only becomes “alive”, that is, “resurrects”, when exactly three neighboring cells are “alive”. Figure 17.1 illustrates an example snapshot of a GoL-based Turing machine.47

Figure 17.1. Turing machine implemented in Conway's Game-of-Life CA (adopted from Ref. [47]).
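The birth/survival rule just described can be stated compactly in code; the following Python sketch (my illustration, not from the chapter) performs one synchronous GoL update on a 2D grid using the eight-cell Moore neighborhood:

```python
import numpy as np

def gol_step(grid):
    # Count alive cells in the Moore neighborhood of every cell (toroidal wrap).
    neighbors = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
    # Survival: an alive cell with 2 or 3 alive neighbors stays alive.
    # Birth: a dead cell with exactly 3 alive neighbors "resurrects".
    alive = (grid == 1) & ((neighbors == 2) | (neighbors == 3))
    born = (grid == 0) & (neighbors == 3)
    return (alive | born).astype(int)

grid = np.zeros((8, 8), dtype=int)
grid[1, 1:4] = 1                      # a "blinker": oscillates with period 2
for _ in range(3):
    print(grid, "\n")
    grid = gol_step(grid)
```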


17.1.4. Neuromorphic computing systems

Neuromorphic computing systems have been proposed for the purpose of bio-mimicking the neuro-biological networks present in the mammalian cortex through the use of VLSI hardware systems containing nanoelectronic analog-based devices.49 This approach presents an attractive solution for implementing emerging computing and beyond-von-Neumann architectures by using nanoelectronic devices that integrate neuromorphic computing models.50 The present chapter focuses on an alternative approach that aims at high performance computing by modeling a compact, parallel and energy efficient structure that is capable of performing neuromorphic computations using a bio-inspired neuron model.

The remainder of this chapter first reviews molecular networks and reveals their inherent features. Afterwards, we present spiking neural networks and describe how they operate through different bio-inspired neuron models. We present the simple Izhikevich model, which is followed by the proposal of two reinforcement excitation techniques, namely, the GoL rule and the Majority rule, respectively, that are employed in our simulations to control the stimulation of neurons. Moving on, we introduce two different CA-inspired unsupervised learning methods, along with neuro-inspired Hebbian learning of self-organization, which were used on the molecular-based topology. And finally, we apply the proposed reinforcement excitation techniques along with the unsupervised learning in order to investigate their learning potential of classifying spatio-temporal signals in neuromorphic classification tasks.


The results obtained are discussed mainly with regard to how these learning techniques, along with molecular-based structures, can be utilized to perform learning tasks and help us with different complex tasks, such as the recognition of multi-dimensional signals.

17.2. Molecular-based Topology

Molecular networks refer to complex bio-inspired networks with molecular-based connectivity. Molecular connectivity can be represented using structural formulas and molecular models, and is determined by a chemical definition of the molecular topology, which is a spatial arrangement of atoms and chemical bonds. In this chapter, we demonstrate the structure of the cell-binding B oligomer of the Verotoxin-1 molecule (VT-1), produced using X-ray diffraction intensities, with resolution 2.05 Å, obtained from E. coli51 and shown in Figure 17.2. This molecular conformation has also been demonstrated as an example with an excitable automata model, in terms of Boolean gate realization through interacting repeated patterns of excitation, showing that higher-dimensional transformations can be illustrated.52

Following the adaptation of the description and modeling of proteins of Ref. [52], the molecular-based crystalline structure of the Verotoxin-1 molecule is converted to a non-directed graph that is embedded into 3D space, where each atom of the molecular structure is considered as a node and the chemical bonds represent the edges between atoms. As a result, the structure of the protein used determines the topology of the graph we are studying. The molecular structure has 2992 atoms and 2831 chemical bonds, where the degree distribution of atoms ranges from 1 to 5. The longest shortest path of the structure has a length of 2774 atoms, and the mean shortest path between any two nodes is 2230 atoms.52

Verotoxin-1 is represented by a graph A = {V, E, C}, where V is a set of nodes, E is a set of edges, and C is a set of Euclidean coordinates of the nodes from V. Let u(s) be the nodes of V that are connected with a node s ∈ V by edges from E; they correspond to atoms connected by bonds with atom s. We call the set of nodes u(s) hard neighbors of the


Figure 17.2. Verotoxin molecule. CPK coloring (a color convention for distinguishing individual atoms); the structure of the molecule is determined in Ref. [51].

node s, because the coupled atoms are determined by the molecular structure of the Verotoxin-1 molecule. The Verotoxin molecule is folded in 3D space, therefore an atom of A can interact with other atoms which are not coupled by chemical bonds. Let w(s) be the nodes of A that are at a distance not exceeding ρ, in Euclidean space, from the node s; we call these the soft neighbors of the atom s, because they are determined by the 3D structure of the Verotoxin molecule and not by the molecular bonds. Therefore, we also refer to the Euclidean cut-off distance ρ, which defines Euclidean areas in 3D space in which each atom can interact with other nearby atoms that are not coupled through the molecular structure. As the minimum Euclidean cut-off distance, we have chosen ρ = 3 Å, which is more than twice the average Euclidean distance between two hard-coupled neighbors (1.42 Å).
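One possible construction of the soft-neighbor sets w(s) from atomic coordinates is sketched below in Python (my own illustration; the coordinates and bond list are placeholder toy data, not the VT-1 crystal structure):

```python
import numpy as np

RHO = 3.0  # Euclidean cut-off distance in angstroms

def soft_neighbors(coords, bonds, rho=RHO):
    # coords: (N, 3) array of atomic positions; bonds: set of hard-coupled
    # index pairs (i < j). Returns, for each atom s, the set w(s) of atoms
    # within distance rho that are not chemically bonded to s.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))            # all pairwise distances
    n = len(coords)
    return {s: {t for t in range(n)
                if t != s and dist[s, t] <= rho
                and (min(s, t), max(s, t)) not in bonds}
            for s in range(n)}

# Toy example: four atoms, one chemical bond between atoms 0 and 1.
coords = np.array([[0.0, 0, 0], [1.4, 0, 0], [0, 2.0, 0], [8.0, 8, 8]])
bonds = {(0, 1)}
print(soft_neighbors(coords, bonds))   # bond 0-1 excluded; atom 3 is isolated
```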


The network configuration is now formulated, and the integration of a neuromorphic model into each atom follows, in order to furnish molecule-like dynamics which could enable interactions between atoms. These atoms could function as CA cells or biological neurons that receive signals, adjust to their environment and interact with their coupled atoms. In this way, atoms can be regarded as finite-state machines which update their “state” depending on their neighborhood's activity.

17.3. Artificial Neural Networks

The well-known artificial neural networks (ANNs) are classified into three generations. The first generation of neural networks involves McCulloch and Pitts neurons, which respond based on the binary system with the distinct values “0” and “1”. Neural networks such as perceptrons, Hopfield networks, Boltzmann machines and multilevel networks with threshold units serve as examples of first generation neural networks. In the second generation, continuous activation functions are used, such as sigmoid, polynomial or exponential functions, which have continuous responses in the range of (0, 1). Thus, the second generation networks require fewer neurons compared to the first generation, due to their graded, rather than binary, outputs. Multilevel perceptron neural networks are included in the second generation networks. The third generation of neural networks consists of bio-inspired spiking neurons.

Spiking neural networks (SNNs), called the third generation of neural networks, consist of bio-inspired spiking neurons and are characterized by their biological plausibility through the proper modeling of biological neuron dynamics. SNNs are a powerful tool for simulating complex neuromorphic network processing, including both computing and learning abilities, through plasticity. As a result, SNNs are utilized as a powerful computational tool in a wide range of applications such as image detection, function detection, image classification, and the solution of various complex recognition tasks.53, 54

In SNNs, information is encoded based on the exact timing of spikes. These spikes depend on the neuron dynamics, which are


properly depicted in mathematical equations, and as such various neuron models have been proposed in the literature. Each neuron model is often divided into three parts. The first concerns the input stimulations, the second the dynamics after receiving stimulation, and the third is related to the excitation of the neuron. Most neuron models are threshold based; they describe the excitation of the neuron as the production of an electrical impulse called a spike. Spikes are the language of neurons, by which neurons are stimulated and interact with their neighbors. A set of pre-synaptic neurons create voltage spikes which are transmitted through synapses to post-synaptic neurons. Each synapse carries a synaptic weight, and the spikes that are transmitted are scaled based on this weight. So, each post-synaptic neuron receives several spikes, which can be treated as percussion functions that deliver a stimulation input current to it. This stimulation input current is calculated through the addition of the corresponding synaptic weights. These stimulations trigger the post-synaptic neurons, whose internal neuron dynamics are thereby affected.

Neuron dynamics can be considered as the internal state of the neuron. If the neuron's internal dynamics cross a specific threshold, then it is considered that the neuron fired. This abrupt change of neuron dynamics is called a state transition. In correspondence, this enables us to introduce three kinds of states: the resting state, the excited state and the refractory state, respectively. As mentioned above, the first two states describe the neuron dynamics when a neuron receives stimulus, alters its dynamics and produces spikes accordingly. The refractory state is a state after the excitation of the neuron. In this state, neurons cannot receive any input stimulations for a specific refractory period embedded in the neuron model.

17.4. Neuron Model

The development and usage of mathematical and computational models to simulate biological neuronal responses has enabled both in silico biological network simulations and research on artificial neural computer networks. In this chapter, the

In this chapter, the simple Izhikevich neuron model is combined with the state transition control that is well known in the field of CA.

17.4.1. Simple Izhikevich model

The simple Izhikevich neuron model55 is capable of reproducing several of the different spike and burst patterns observed in biological neurons by parameterizing only a set of four variables in the following system of two-dimensional differential equations:

\dot{v}_i(t) = 0.04\, v_i^2(t) + 5\, v_i(t) + 140 - u_i(t) + I_i(t),   (17.1)

\dot{u}_i(t) = a\, \bigl( b\, v_i(t) - u_i(t) \bigr),   (17.2)

with a threshold-based reset of the neuron dynamics:

\text{if } v_i(t) \ge V_{\text{thresh}}: \quad v_i(t) \to c, \quad u_i(t) \to u_i(t) + d,   (17.3)

where v represents the membrane potential of the neuron in mV and u represents the membrane recovery variable, which accounts for the activation of K+ and the inactivation of Na+ ionic currents that provide the appropriate negative feedback to v. The parameters a, b, c and d are dimensionless and correspond to various neuromorphic characteristics: the time scale of the recovery variable, the sensitivity to sub-threshold fluctuations, the after-spike reset value of the membrane potential v, and the after-spike reset value of the membrane recovery variable u, respectively. The variable t refers to the simulation time, discretized on a 1 ms scale. When the membrane potential exceeds the threshold V_thresh = 30 mV, the neuron is assumed to be excited; the membrane potential and the recovery variable are then reset according to Equation (17.3), and the neuron is assumed to enter a refractory state. Finally, the parameter I represents the stimulation from coupled neurons and external stimulation, according to the following equations:

I_i(t) = \sum_{j \in C_i}^{N} I_o\, S_{ij}(t) + I_i^{\text{in}}(t),   (17.4)

S_{ij}(t) = w_{ij}\, \delta(t - t_f),   (17.5)

where S_{ij}(t) takes the value of the synaptic weight w_{ij} between the pre-synaptic neuron j and the post-synaptic neuron i; the synaptic weight w_{ij} varies within the range (0, 1) and is multiplied by a current amplitude I_o to give the appropriate interconnection current, which is of the order of pA. Equation (17.4) thus sums the synaptic weights of the pre-synaptic neurons that fired and adds an external excitation current I_i^{in}(t), so as to produce the different kinds of neuromorphic bifurcations observed in biological neurons.56
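
To make the model concrete, here is our own minimal Python sketch (not code from the chapter) that integrates Equations (17.1)–(17.3) with a 1 ms Euler step and assembles the stimulation current of Equations (17.4)–(17.5); the default amplitude I_o = 50 pA is borrowed from the regular-spiking discussion later in the chapter:

```python
import numpy as np

def izhikevich_step(v, u, I, a, b, c, d, v_thresh=30.0, dt=1.0):
    """One dt-ms Euler update of Eqs. (17.1)-(17.3) for arrays of neurons."""
    fired = v >= v_thresh                  # threshold crossing, Eq. (17.3)
    v = np.where(fired, c, v)              # after-spike reset of the membrane potential
    u = np.where(fired, u + d, u)          # after-spike reset of the recovery variable
    v = v + dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)    # Eq. (17.1)
    u = u + dt * a * (b * v - u)                            # Eq. (17.2)
    return v, u, fired

def input_current(fired, W, I_ext, I_o=50.0):
    """Eq. (17.4): spikes of coupled pre-synaptic neurons, scaled by the weights
    W[i, j] (Eq. (17.5)) and the current amplitude I_o, plus external input."""
    return I_o * (W @ fired.astype(float)) + I_ext
```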

17.5. Excitation Reinforcement

To address the need to control this kind of bifurcation, reinforcement techniques are introduced. These reinforcement techniques consist of state transition rules that describe the interactions between the neurons, and they are defined as a novel form of stimulation control. Such reinforcements are widely used in simulations in the field of CAs, where neurons are considered finite-state-machine processors and different transition rules between states are defined with respect to the overall states of the neighboring neurons. Consequently, the impact of a state transition rule on each neuron is based on the overall behavior of its coupled neurons, or neighbors, and determines whether the neuron can be stimulated or not. In the following, we briefly describe the transition rules utilized in the presented study.

17.5.1. Majority-rule

The Majority-rule is a CA-inspired decision rule, which determines the stimulation based on the activity of the majority of the neighborhood.57 More specifically, each neuron receives stimulation when the majority of its neighboring neurons have been excited. Under this rule, if the majority of the neighborhood of a certain neuron has produced a spike, then the neuron receives stimulation from them; otherwise, the neuron is considered disconnected from the network. A minimal sketch of this gate is given below.
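
The following is our own minimal numpy sketch of this gate (not code from the chapter); `W` is the synaptic weight matrix and `fired` the boolean spike vector from the previous step, as in the sketch above:

```python
def majority_rule_gate(fired, W):
    """Majority-rule stimulation control: a neuron receives input current only
    when more than half of its coupled neighbors fired in the previous step."""
    neighbors = W > 0                                   # adjacency from nonzero weights
    n_neighbors = neighbors.sum(axis=1)
    n_active = (neighbors & fired[None, :]).sum(axis=1)
    return 2 * n_active > n_neighbors                   # boolean gate per neuron
```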

17.5.2. Game-of-Life rule

This rule is inspired by Conway's GoL CA rule, where cells are considered "dead" or "alive". Here, we recast the cell states as resting or excited in order to integrate the rule into the neuron dynamics. In this way, each neuron can be stimulated depending on its own state and on the activity of its neighboring neurons. If the neuron has produced a spike, it receives stimulation currents only if two or three of its neighboring neurons have been excited; otherwise, it is stimulated only when exactly three neighboring neurons have fired. In any other case, that is, when fewer or more coupled neurons have fired, the neuron dynamics are not affected by the stimulation current. A corresponding sketch follows.
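
In the same illustrative numpy style as above (our sketch, mirroring Conway's birth/survival conditions):

```python
def game_of_life_gate(fired, W):
    """GoL-style stimulation control: an excited neuron keeps receiving input
    current with 2 or 3 active neighbors; a resting (or refractory) neuron is
    stimulated only when exactly 3 neighbors fired."""
    neighbors = W > 0
    n_active = (neighbors & fired[None, :]).sum(axis=1)
    survive = fired & ((n_active == 2) | (n_active == 3))
    birth = (~fired) & (n_active == 3)
    return survive | birth
```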

17.6. Unsupervised Learning

Recurrent networks offer rich neuromorphic dynamics through the spatio-temporal interactions between neurons via electrical signals, which can be exploited by any kind of reinforcement learning. This kind of learning can be viewed as the adjustment of synaptic weights according to the neuron dynamics and the overall neuromorphic activity of neighboring neurons.58 In addition, for reinforcement-based unsupervised learning, we introduce the terms reward and penalty for synaptic plasticity: a reward increases the synaptic weight, and consequently strengthens the interaction between neurons, while a penalty weakens the synaptic weight accordingly. In either case, the synaptic weights are adjusted based on their previous values and a specified learning rate. In view of the foregoing, we introduce three reinforcement learning techniques based on the neuromorphic activity of a neuron and its coupled neighbors; a sketch of the common reward/penalty update appears at the end of this section.

17.6.1. Majority-rule learning

Similarly to the stimulation control, we introduce majority-rule learning, in which the synaptic weights are adjusted according to whether the majority of coupled neurons have been excited. In case of simultaneous excitation with the neighborhood majority, the synaptic weight is rewarded; in any other case, that is, when the neighborhood majority has been excited and the neuron has not, or vice versa, the synaptic weight is penalized. Lastly, if neither the neuron nor the neighborhood majority is excited, the synaptic weights remain the same. In short, neurons are reinforced to follow the activity of their neighborhood as a self-organizing technique.

17.6.2. Game-of-Life learning

Following this approach, we introduce the GoL learning rule, which is based on the Game-of-Life state transition rules already described above. In short, if the neuron fires simultaneously with two or three coupled neurons, then the synaptic weights to those neurons are rewarded, while the others are penalized. Also, if the neuron is in a resting or refractory state (i.e., has not been excited) and exactly three neighboring neurons have been excited, then the synapses between those neurons are rewarded; otherwise, they are penalized.

17.6.3. Hebbian learning

Finally, the well-known Hebbian learning was applied. Hebbian learning is considered the most biologically plausible neuro-inspired learning method.59 It is also known as Hebb's rule60 in neuroscience: neurons that fire together wire together. To implement it, the synaptic weight between neurons that are excited together is rewarded, while when they are not synchronized it is penalized. This means that the more often neurons are excited simultaneously, the stronger their synaptic weight becomes. In the bigger picture, forming clusters of neurons that fire all at once could support various neuromorphic applications such as classification tasks.
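
The reward/penalty update shared by these rules can be sketched as follows; this is our illustrative implementation of the Hebbian variant (Section 17.6.3), with the learning rate `lr` as an assumed free parameter rather than a value given in the chapter:

```python
def hebbian_update(W, fired, lr=0.01):
    """Hebbian reward/penalty plasticity: synapses between co-firing neurons
    are rewarded, synapses where only one endpoint fired are penalized;
    weights are kept in the (0, 1) range used throughout the chapter."""
    exists = W > 0                                      # update only existing synapses
    co_fired = np.outer(fired, fired)                   # both endpoints spiked
    one_fired = np.logical_xor.outer(fired, fired)      # exactly one endpoint spiked
    dW = lr * (co_fired.astype(float) - one_fired.astype(float))
    return np.clip(np.where(exists, W + dW, W), 0.0, 1.0)
```

The Majority-rule and GoL learning variants differ only in the condition under which a synapse is rewarded or penalized.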

17.7. Training and Classification

For reasons of statistical completeness, we have applied all the aforementioned combinations of stimulation control and unsupervised learning methods to the same random external stimulation currents and the same random initial synaptic weights of the molecular network. It is important to note that we have split the simulation time into a training and a testing phase; during the first phase, we supplied a different random external stimulation to each neuron, sampled from a normal distribution with mean μ = 0 and σ = 0.8. In addition, the initial synaptic weights were drawn from a random distribution in the range (0, 1). In this chapter, we have examined through detailed simulations the combinations of transition rules presented in Table 17.1, and the results are presented in the order of the proposed combinations (a minimal sketch of this initialization is given after Table 17.1).

To enhance the biological plausibility of the proposed neuromorphic network, we consider two types of neurons with different neuromorphic bifurcations in their neuron dynamics, biologically corresponding to the different chemical messengers, called neurotransmitters, that define excitation or inhibition between neurons. Therefore, we employ two kinds of neurons, namely excitatory neurons and inhibitory neurons.61 In particular, excitatory neurons cause excitation of their coupled neurons, whereas inhibitory neurons act restrictively and decrease the neuron dynamics to prevent excitation. This means that the firing of excitatory neurons contributes positively, while the firing of inhibitory neurons contributes negatively, to the stimulation current of coupled neurons. Moreover, excitatory and inhibitory neurons differ in how external stimulation affects them, with excitatory neurons being more sensitive to external stimuli than inhibitory ones.

Table 17.1. Proposed combinations of stimulation control and reinforcement learning methods.

#   Stimulation control   Learning method
1   Game-of-Life rule     Majority-rule learning
2   Majority-rule         Game-of-Life learning
3   Majority-rule         Majority-rule learning
4   Majority-rule         Hebbian learning
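
For concreteness, the random initialization described above could be set up as follows (our sketch; the network size `n` is illustrative, and we read the weight distribution as uniform on (0, 1)):

```python
rng = np.random.default_rng(0)            # fixed seed, for reproducibility only
n = 500                                   # illustrative network size
I_ext = rng.normal(0.0, 0.8, size=n)      # external stimulation: mu = 0, sigma = 0.8
W = rng.random((n, n))                    # initial synaptic weights in (0, 1)
```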

Moreover, the proposed classification of neurons was not performed randomly. More specifically, the oxygen and hydrogen atoms of the molecular structure were classified as inhibitory neurons, taking into consideration the 4:1 ratio of excitatory to inhibitory neurons found in the cerebral cortex.62 Their spatial arrangement usually places them in the branches of the main chain of the structure, so that they do not act harmfully on signal propagation in the molecule. To simulate this behavior in the Izhikevich model and achieve the necessary biological analogy, the model parameter values for both types of neurons are listed in Table 17.2. In this work, the external stimulation that affects each neuron is used and considered as the input of the neuromorphic system. We define this input vector with random external stimulations in the range (0, 1) to affect the network dynamics. According to the Izhikevich model for a regular spiking neuron, a neuron can produce Andronov–Hopf bifurcations, that is, produce spikes with a constant firing rate, given a stimulation current of 50 pA, which can come either from an external stimulus or from its coupled neurons. In this way, concerning unsupervised learning, the synchronized activity of certain clusters of neurons is reinforced by the utilization of a random stimulation input. During the training phase, the synaptic weights between neurons of the molecular structure are updated according to the corresponding rules that have already been set.

Table 17.2. Simple Izhikevich neuron model parameters adapted for the different neuronal activity of excitatory and inhibitory neurons.

Neuron type         a      b      c      d
Excitatory neuron   0.02   0.20   −65    2
Inhibitory neuron   0.01   0.25   −100   8

Moreover, in each interval, each neuron of the molecular network is stimulated through an external stimulus which is randomly initialized, as mentioned above, and remains constant during the simulation. A compact sketch of the full training/testing protocol is given below.
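
Putting the pieces together, the training/testing protocol described in this section and the next can be sketched as follows, reusing the helper functions sketched earlier; all parameter values and phase lengths are illustrative defaults:

```python
def run_simulation(W, I_ext, a, b, c, d, t_train=500, t_test=500):
    """Protocol sketch: plasticity is active for the first t_train ms; weights
    are then frozen, and the neuron dynamics are reset once to separate the
    training and testing phases."""
    n = len(I_ext)
    v, u = np.full(n, -65.0), b * np.full(n, -65.0)     # common initial state
    fired = np.zeros(n, dtype=bool)
    for t in range(t_train + t_test):
        if t == t_train:                                # phase boundary: reset dynamics
            v, u = np.full(n, -65.0), b * np.full(n, -65.0)
            fired = np.zeros(n, dtype=bool)
        gate = majority_rule_gate(fired, W)             # or game_of_life_gate
        I = np.where(gate, input_current(fired, W, I_ext), 0.0)
        v, u, fired = izhikevich_step(v, u, I, a, b, c, d)
        if t < t_train:                                 # learning only while training
            W = hebbian_update(W, fired)
    return W
```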

17.8. Results

As seen in the diagrams, the learning phase takes place during the first 500 ms of the simulation, while the testing phase occupies the rest of the time. It should be noted that the criteria for completing the training process are defined by the respective task; here, an arbitrary training time of 500 ms is used, divided into 1 ms iterations of updates of the simple Izhikevich model. The testing phase differs from the training phase only in that no further synaptic weight adjustments take place. In this way, the behavior of the network, in terms of the activity of certain clusters of neurons, is easily observed. Another useful technique is the re-initialization between the training and testing phases: purely to separate the two phases, the neuron dynamics of every neuron in the network are reset. As shown in the simulation results, the neuron dynamics of certain neighborhoods are recorded to highlight the activation of synchronized neuron clusters, which indicates that learning has taken place. This synchronized activity is caused by the chain-based structure of the molecule as well as by its spatial arrangement due to protein folding. However, the connectivity can be modified according to the demands of each application, much as connectivity adjustments occur in the cerebral cortex. In Figures 17.3(a)–17.3(c), we show the neuron dynamics of three coupled neurons. Such dynamics, as depicted in Figure 17.3, indicate that the synchronization of certain neurons based on the activity of their neighborhood is feasible through training. These neurons were selected because of the neuron dynamics of neuron No. 424. Moreover, the global neuronal activity of the network, that is, the firings or spikes produced by each neuron during the simulation, is also visualized.

Figure 17.3. Network simulation employing the GoL rule as stimulation control and the Majority-rule learning method as reinforcement learning: (a) neural activity of certain neurons; (b) network neural activity; (c) network total firings.

This activity is presented in Figures 17.3(c), 17.4(c), 17.5(c), and 17.6(c), respectively. These network dynamics are further quantified through a temporal firing rate over the simulation, showing the overall spikes that occur in the network, as presented in Figures 17.3(b), 17.4(b), 17.5(b), and 17.6(b).

Figure 17.4. Network simulation employing the Majority-rule as stimulation control and the GoL learning method as reinforcement learning: (a) neural activity of certain neurons; (b) network neural activity; (c) network total firings.

By employing the first simulation scenario, namely the combination of GoL as stimulation control (SC) with the Majority-rule as learning method (LM), or the second scenario, that is, the Majority-rule as SC with GoL learning as LM, the molecular-based neural network is investigated; the obtained results are presented in Figures 17.3 and 17.4, accordingly.

In both scenarios, presented in Figures 17.3(a) and 17.4(a), the first neuron has a constant firing rate, and this combination of control and learning shows that the neighboring neurons are notably affected, as they do not fire during either the learning or the testing phase. Also, Figures 17.3(c) and 17.4(c) show that there are clusters of neurons with higher firing rates; however, learning does not enhance these clusters, which remain constant through time. This is confirmed by the results of Figures 17.3(b) and 17.4(b), where no learning takes place: the network activity remains constant throughout the training phase while the overall spiking activity fluctuates abruptly. Next, we employed the combination of the Majority-rule as SC and the Majority-rule as LM, as shown in Figure 17.5(a). Here, different neuron dynamics are obtained for the specific neuron 424. It is observed that this neuron exhibits certain bursts during the training phase, while its neighboring neuron is gradually activated. As a consequence, given the constant input stimulation, they produce fast spiking or bursting behavior during the testing phase. Additionally, Figure 17.5(c) highlights the overall spiking activity of the network, presenting the progressive formation of new bursting patterns across the network. The progressive formation of new oscillation patterns can also be observed in Figure 17.5(b), which shows the overall firing rate of the network: during the training phase a linear increase of the firing rate is recorded, and during the testing phase a formed bursting cluster of neurons fluctuates steadily. Lastly, Figure 17.6 illustrates the combination of the Majority-rule as SC and the Hebbian rule as LM. As shown in Figure 17.6(a), this combination, unlike the others, produces burst formation in both neighboring neurons. The generalization of the phenomenon, as illustrated by Figures 17.6(b) and 17.6(c), is strong evidence of learning ability occurring across the network as a whole.

Figure 17.5. Network simulation employing the Majority-rule as stimulation control and the Majority-rule learning method as reinforcement learning: (a) neural activity of certain neurons; (b) network neural activity; (c) network total firings.

The rapid formation of spiking patterns in Figures 17.6(b) and 17.6(c) points to a more effective correlation of the neuronal activity with the stimulation input; for this reason, this combination can be considered the most promising.

Figure 17.6. Network simulation employing the Majority-rule as stimulation control and the Hebbian learning method as reinforcement learning: (a) neural activity of certain neurons; (b) network neural activity; (c) network total firings.

17.9. Discussion

According to the simulation results presented above, various useful conclusions can be drawn. First, we observed that when GoL is applied as SC or as LM, no satisfactory results are obtained, and it is therefore deemed unsuitable for learning purposes.

This is easily understood from the fact that no significant change in the data is observed between the training and the testing phase. According to the network dynamics, there is no clear separation between the data of the training and the testing phases, and we therefore conclude that no kind of learning occurs in the neural network when we stimulate it with a constant stimulus input. On the contrary, the results obtained by applying the Majority method and the Hebbian method as learning methods highlight their ability to train the proposed neural network, which means that these learning rules enhance the ability of the network to distinguish specific stimulation inputs. This can be further observed from the temporal firing rate of the network. First, it is noticed that in the neighborhood of neuron 424, after a period of time, a Hopf bifurcation occurs in all three neurons. Then, the raster diagram that represents the firings of the network demonstrates the creation of new bursting patterns across the network; this means that the network exhibits some kind of classification of each input stimulus. In summary, according to the results already presented, the ability of molecular networks to be trained using unsupervised learning methods, namely the Majority-rule and Hebbian-rule learning methods, is confirmed. The strongest evidence of learning concerns the temporal firing rates of the network: both the Majority-rule and the Hebbian-rule learning methods show an increase in the total firings of the network during the learning phase. This also explains the fact that new firing patterns are created in the network, as mentioned above. Molecular-based structures can thus be trained through unsupervised learning, as reflected in the creation of new bursting clusters of neurons in the network during the training phase. After the training phase, we observed intense firing activity of the network in the form of bursting behavior, and this is strong evidence that classification of specific multi-dimensional signals could be demonstrated through proper recognition of the neuron clusters. In general, although we have observed satisfactory learning with the Majority-rule learning method, we still obtain the best results with the Hebbian learning method.

We can distinguish these two learning methods by the temporal firing-rate activity of the network: the Hebbian-rule learning method yields a clearer classification of network activity than the majority method, through the larger number of clusters created in the network.

Acknowledgement

This work has been supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the "First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment grant" (Project Number: 3830).

References

1. M. Mitchell, Complex systems: Network thinking. Artif. Intell. 170(18), 1194–1212 (2006).
2. F. Kepes, Biological Networks, vol. 3 (World Scientific, 2007).
3. G. Bagler and S. Sinha, Network properties of protein structures. Physica A: Statist. Mech. Appl. 346(1–2), 27–33 (2005).
4. E. E. Schadt, Molecular networks as sensors and drivers of common human diseases. Nature 461(7261), 218–223 (2009).
5. J. R. Nitschke, Molecular networks come of age. Nature 462(7274), 736–738 (2009).
6. G. Amitai, A. Shemesh, E. Sitbon, M. Shklar, D. Netanely, I. Venger and S. Pietrokovski, Network analysis of protein structures identifies functional residues. J. Molecular Biol. 344(4), 1135–1146 (2004).
7. B. Chakrabarty and N. Parekh, NAPS: Network analysis of protein structures. Nucleic Acids Res. 44(W1), W375–W382 (2016).
8. A. Gursoy, O. Keskin and R. Nussinov, Topological properties of protein interaction networks from a structural perspective (2008).
9. T. Sienko, A. Adamatzky and N. Rambidi, Molecular Computing (MIT Press, 2003).
10. J. von Neumann, A. W. Burks et al., Theory of Self-Reproducing Automata, vol. 1102024 (University of Illinois Press, Urbana, 1966).
11. S. Wolfram, Theory and Applications of Cellular Automata (World Scientific, 1986).
12. P. Di Lena and L. Margara, Computational complexity of dynamical systems: The case of cellular automata. Inform. Comput. 206(9–10), 1104–1116 (2008).
13. S. Wolfram, A New Kind of Science (Wolfram Media, 2002). ISBN 1579550088. URL http://www.wolframscience.com.

14. R. P. Feynman, Simulating physics with computers. Int. J. Theor. Phys. 21(6/7) (1982).
15. G. C. Sirakoulis and S. Bandini, Cellular automata. In Proceedings of the 10th International Conference on Cellular Automata for Research and Industry, ACRI 2012, Santorini Island, Greece, September 24–27, 2012, vol. 7495 (Springer, 2012). ISBN 9783642333491.
16. J. Was and G. C. Sirakoulis, Cellular automata applications for research and industry. J. Comput. Sci. 11, 223–225 (2015). ISSN 1877-7503.
17. A. Adamatzky, K. Szacilowski, Z. Konkoli, L. C. Werner, D. Przyczyna and G. C. Sirakoulis, On buildings that compute: A proposal. In From Astrophysics to Unconventional Computation (Springer, 2020), pp. 311–335.
18. G. Kalogeropoulos, G. C. Sirakoulis and I. Karafyllidis, Cellular automata on FPGA for real-time urban traffic signals control. J. Supercomput. 65(2), 664–681 (2013).
19. T. Toffoli, Cellular automata as an alternative to (rather than an approximation of) differential equations in modeling physics. Physica D: Nonlinear Phenomena 10(1–2), 117–127 (1984).
20. T. Giitsidis, N. I. Dourvas and G. C. Sirakoulis, Parallel implementation of aircraft disembarking and emergency evacuation based on cellular automata. Int. J. High Perform. Comput. Appl. 31(2), 134–151 (2017).
21. I. G. Georgoudas, P. Kyriakos, G. C. Sirakoulis and I. T. Andreadis, An FPGA implemented cellular automaton crowd evacuation model inspired by the electrostatic-induced potential fields. Microprocess. Microsyst. 34(7–8), 285–300 (2010).
22. M. Mitsopoulou, N. Dourvas, I. G. Georgoudas and G. C. Sirakoulis, Cellular automata model for crowd behavior management in airports. In Int. Conf. on Parallel Processing and Applied Mathematics (Springer, 2019), pp. 445–456.
23. A. Tsiftsis, I. G. Georgoudas and G. C. Sirakoulis, Real data evaluation of a crowd supervising system for stadium evacuation and its hardware implementation. IEEE Syst. J. 10(2), 649–660 (2015).
24. M.-A. I. Tsompanas and G. C. Sirakoulis, Modeling and hardware implementation of an amoeba-like cellular automaton. Bioinspir. Biomimet. 7(3), 036013 (2012).
25. M.-A. I. Tsompanas, G. C. Sirakoulis and A. I. Adamatzky, Physarum in silicon: The Greek motorways study. Nat. Comput. 15(2), 279–295 (2016).
26. N. Dourvas, M.-A. Tsompanas, G. C. Sirakoulis and P. Tsalides, Hardware acceleration of cellular automata Physarum polycephalum model. Parallel Process. Lett. 25(01), 1540006 (2015).
27. N. I. Dourvas, M.-A. I. Tsompanas and G. C. Sirakoulis, Parallel acceleration of slime mould discrete models. In Advances in Physarum Machines (Springer, 2016), pp. 595–617.
28. R. Mayne, M.-A. Tsompanas, G. C. Sirakoulis and A. Adamatzky, Towards a slime mould-FPGA interface. Biomed. Eng. Lett. 5(1), 51–57 (2015).
29. M. Madikas, M.-A. Tsompanas, N. Dourvas, G. C. Sirakoulis, J. Jones and A. Adamatzky, Hardware implementation of a biomimicking hybrid CA. In International Conference on Cellular Automata (Springer, 2018), pp. 80–91.

30. N. I. Dourvas, M.-A. Tsompanas and G. C. Sirakoulis, Implementing cellular automata bio-inspired algorithms on field programmable gate arrays.
31. I. Georgoudas, G. C. Sirakoulis, E. Scordilis and I. T. Andreadis, On-chip earthquake simulation model using potentials. Nat. Hazards 50(3), 519–537 (2009).
32. I. Vourkas and G. C. Sirakoulis, FPGA based cellular automata for environmental modeling. In 2012 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS 2012) (IEEE, 2012), pp. 93–96.
33. P. Progias and G. C. Sirakoulis, An FPGA processor for modelling wildfire spreading. Math. Comput. Modell. 57(5–6), 1436–1452 (2013).
34. V. G. Ntinas, B. E. Moutafis, G. A. Trunfio and G. C. Sirakoulis, Parallel fuzzy cellular automata for data-driven simulation of wildfire spreading. J. Comput. Sci. 21, 469–485 (2017).
35. A. Adamatzky, Computation of shortest path in cellular automata. Math. Comput. Modell. 23(4), 105–113 (1996).
36. S. G. Akl, Computing shortest paths with cellular automata. In Shortest Path Solvers. From Software to Wetware (Springer, 2018), pp. 181–198.
37. M.-A. I. Tsompanas, N. I. Dourvas, K. Ioannidis, G. C. Sirakoulis, R. Hoffmann and A. Adamatzky, Cellular automata applications in shortest path problem. In Shortest Path Solvers. From Software to Wetware (Springer, 2018), pp. 199–237.
38. V. Ntinas, R.-E. Karamani, I.-A. Fyrigos, N. Vasileiadis, D. Stathis, I. Vourkas, P. Dimitrakis, I. Karafyllidis and G. C. Sirakoulis, Cellular automata coupled with memristor devices: A fine unconventional computing paradigm. In 2020 International Conference on Electronics, Information, and Communication (ICEIC) (IEEE, 2020), pp. 1–4.
39. N. I. Dourvas, G. C. Sirakoulis and A. I. Adamatzky, Parallel accelerated virtual Physarum lab based on cellular automata agents. IEEE Access 7, 98306–98318 (2019).
40. N. I. Dourvas, G. C. Sirakoulis and A. Adamatzky, Cellular automaton Belousov–Zhabotinsky model for binary full adder. Int. J. Bifurc. Chaos 27(06), 1750089 (2017).
41. V. G. Ntinas, B. E. Moutafis, G. A. Trunfio and G. C. Sirakoulis, GPU and FPGA parallelization of fuzzy cellular automata for the simulation of wildfire spreading. In R. Wyrzykowski, E. Deelman, J. Dongarra, K. Karczewski, J. Kitowski and K. Wiatr, eds., Parallel Processing and Applied Mathematics (Springer International Publishing, Cham, 2016), pp. 560–569.
42. J. Conway, The game of life. Sci. Am. 223(4), 4 (1970).
43. S. Wolfram, Cellular Automata and Complexity: Collected Papers (Addison-Wesley, Reading, 1994).
44. S. Wolfram, A New Kind of Science, vol. 5 (Wolfram Media, Champaign, IL, 2002).
45. M. G. Kechaidou and G. C. Sirakoulis, Game of Life variations for image scrambling. J. Comput. Sci. 21, 432–447 (2017). ISSN 1877-7503.
46. A. Adamatzky, Game of Life Cellular Automata, vol. 1 (Springer, 2010).

47. P. Rendell, Turing universality of the Game of Life. In Collision-Based Computing (Springer, 2002), pp. 513–539.
48. P. Rendell, Turing machine in Conway's Game of Life. In Designing Beauty: The Art of Cellular Automata (Springer, 2016), pp. 149–154.
49. C. Mead, Neuromorphic electronic systems. Proc. IEEE 78(10), 1629–1636 (1990).
50. G. Indiveri and S. Liu, Memory and information processing in neuromorphic systems. Proc. IEEE 103(8), 1379–1397 (2015).
51. P. E. Stein, A. Boodhoo, G. J. Tyrrell, J. L. Brunton and R. J. Read, Crystal structure of the cell-binding B oligomer of verotoxin-1 from E. coli. Nature 355(6362), 748–750 (1992).
52. A. Adamatzky, Computing in verotoxin. ChemPhysChem 18(13), 1822–1830 (2017).
53. A. Tavanaei, M. Ghodrati, S. R. Kheradpisheh, T. Masquelier and A. Maida, Deep learning in spiking neural networks. Neural Networks 111, 47–63 (2019).
54. S. Ghosh-Dastidar and H. Adeli, Spiking neural networks. Int. J. Neural Systems 19(04), 295–308 (2009).
55. E. M. Izhikevich, Simple model of spiking neurons. IEEE Trans. Neural Networks 14(6), 1569–1572 (2003).
56. E. M. Izhikevich and G. M. Edelman, Large-scale model of mammalian thalamocortical systems. Proc. Nat. Acad. Sci. 105(9), 3593–3598 (2008).
57. M. Mitchell, J. P. Crutchfield, R. Das et al., Evolving cellular automata with genetic algorithms: A review of recent work. In Proceedings of the First International Conference on Evolutionary Computation and Its Applications (EvCA'96), vol. 8 (1996).
58. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, 2018).
59. R. Kempter, W. Gerstner and J. L. van Hemmen, Hebbian learning and spiking neurons. Phys. Rev. E 59(4), 4498 (1999).
60. D. O. Hebb, The Organization of Behavior: A Neuropsychological Theory (Psychology Press, 2005).
61. E. Izhikevich, Dynamical Systems in Neuroscience (MIT Press, 2007), p. 111.
62. R. Druga, Neocortical inhibitory system. Folia Biologica 55, 201–217 (2009).

© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235726_0018

Chapter 18

Intelligent Gamesourcing — Artificial Intelligence in Problem Solution by Game Playing

Ivan Zelinka∗,§, Jiri Arleth†,¶, Michal Bukacek∗, and Tran Trong Dao‡,∗∗

∗Faculty of Electrical Engineering and Computer Science, Department of Computer Science, VŠB-TUO, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
†Software Engineer, ABB s.r.o., 28. října 3348/65, Nová Karolina Park, 702 00 Ostrava, Czech Republic
‡Faculty of Electrical and Electronics Engineering, Division of MERLIN, Ton Duc Thang University, Vietnam
§[email protected], ¶[email protected], [email protected], ∗∗[email protected]

Gamesourcing is a new term coined from the English words game and crowdsourcing. The first part needs no explanation, but the term crowdsourcing is somewhat more involved. It is a method in which work (often very difficult for a computer, but relatively easy for a human being) is divided among many people who, depending on the nature of the assignment, either complete it together or each completes it individually. In the first case, all participants contribute to the final result; in the second case, only the best work is selected. By connecting this with games, that is, gamesourcing, we obtain a means of performing crowdsourcing on a slightly different basis: players do not even have to know that they are doing crowdsourcing at all.

18.1. Introduction

18.1.1. Gamesourcing

In the last decade, a new phenomenon has begun to spread uncontrollably on the Internet: crowdsourcing, the use of a large group of people for some creative activity, for example drawing on their ideas, suggestions, content, or contributions, whether financial or professional. The uses are practically unlimited: education (Wikipedia), research, healthcare, transport (accident reporting), marketing and advertising, donations, volunteering, and so on. Even in the political sphere, the "wisdom of the crowds" has already found its application. After 2008, when Iceland went bankrupt due to the economic crisis, a new constitution was drafted, and its draft was made available for wide discussion online. Through debates and comments on social networks, Icelandic citizens were able to influence and correct the final form and wording of the Constitutional Charter.1,2 Crowdsourcing is thus slowly becoming part of everyone's life to some extent, often without our realizing it. This chapter deals with one of its special forms, so-called gamesourcing, where the means of collective cooperation is playing games.

18.1.2. History

The term "crowdsourcing" was first used in 2006,3 which could lead (given the rise of the Internet) to the mistaken assumption that it is an online affair alone. However, the process of using the masses to collaborate on a task has worked since time immemorial, beginning in prehistory.4 At that time, it amounted to a chief receiving advice and recommendations from his subordinates, but in a way it was a kind of collective participation in achieving a goal (e.g., how to survive the winter). Thus, a particular form of crowdsourcing existed for tens of thousands of years before the omnipresent media emerged, so it is nothing new. An example more similar to the current form of crowdsourcing is the case of 1714, when the British government dealt with safety in seafaring. The "longitude problem" made voyages very difficult, costing Her Majesty's Navy thousands of lives a year.

The point was that at that time, sailors were able to determine their latitude without any problem (from the height of the Sun), that is, how far north or south of the equator they were. However, there was no reliable method to determine longitude, that is, how far east or west they had sailed. The British Parliament therefore announced a reward of £20,000 for whoever came up with a solution; in terms of today's value, such an amount would be equivalent to several million US dollars.5 The complexity of the task lay mainly in the need for accurate time measurement. A pendulum clock was out of the question due to the swaying of the waves, and clocks with springs and gears were unreliable and inaccurate. The winner of the competition was John Harrison, the son of a carpenter. He came up with a marine chronometer, an accurate vacuum-sealed pocket watch. After a 47-day voyage, it was off by only 39.2 s,6 earning its inventor a hefty reward. This example beautifully shows one of the main ideas of crowdsourcing: innovation and creativity can come from anywhere. By 1879, the idea of creating a chronological dictionary of the English language had been around for many years. The original intention was to write a 4-part, 6400-page dictionary containing the meanings of words in their chronological order, from the oldest quotation in context to the most recent, to document changes in meaning. Usage was to be mapped back to the time of the Saxons, about 1,000 years earlier. It was undoubtedly a very ambitious project that could hardly be achieved by one person: "It would be necessary to hire a team — moreover very large — composed of hundreds and hundreds of unpaid amateurs working voluntarily" (Ref. [7, p. 106]). Therefore, a call was sent out inviting volunteers to choose a historical period from which they would like to read, and then to send in extracts of words used in that literature. In 1928, the project was completed, and the result was the Oxford English Dictionary; an incredible 6,000,000 readers took part.6 This dictionary is still being developed, now through online forms, and with more than 600,000 entries in the context of 2,500,000 quotations it is one of the largest and most important dictionaries ever.8 Let us go back to the last century, when we do not have to look far at all for the result of a crowdsourcing project

which took place in 1936, and which we see practically every day. At that time, Toyota announced a tender to redesign its logo and received a total of 27,000 proposals.9 The winning design has remained with the carmaker to this day, and we can see it on each of its cars. Crowdsourcing has also been used in architecture for some time. In 1955, the Premier of the Australian state of New South Wales, Joseph Cahill, announced a £5,000 competition to design an opera house on the harbor shore in Sydney. A total of 233 proposals from 32 different countries were submitted. The winning design, by the Danish architect Jorn Utzon, foreshadowed the creation of one of the most iconic and innovative buildings of our time. This type of architectural competition continues to be widely used.10 Until the advent of computers as we know them today, "computers" were people whose job was to carry out mathematical calculations by hand. They played a key role, especially during World War II.11 Each of them solved simple tasks within a team that was, in this way, tackling a more complex problem. Their painstaking work brought advances in science, industry, and weaponry, and eventually led to the invention of the machine computers by which they were replaced. Nevertheless, everything was still offline. The first attempts at online collaboration began in the 1960s, when hundreds and sometimes thousands of developers were able to participate in the development of open-source software.12 With the later massive rise of the Internet, completely new horizons for crowdsourcing opened up. The cooperation of the masses suddenly became easier, faster and, above all, accessible to a larger number of people. With the advent of the third millennium, websites were created that are not officially called crowdsourcing but in essence are, such as YouTube or Wikipedia. Efforts to use people for machine learning also began to emerge: building intelligent systems using human perception and intelligence, such as the Open Mind Initiative.13 Volunteers provided answers over the Internet to questions that computers could not solve ("What's in this picture?"). This was very close to gamesourcing, but there were two pitfalls: it relied on the willingness of unpaid volunteers to invest their effort and time, and there was no guarantee that their contributions would not be wrong.

In 2006, Jeff Howe first used the term "crowdsourcing" in his article,14 and it has since become a common word, increasingly familiar to the public. The first purely crowdsourcing projects began to appear, such as DesignCrowd (crowdsourcing the creation of graphic designs, logos, websites, etc.) or Digg (a crowdsourced news aggregator). Crowdfunding also began to develop: many individuals contributing smaller amounts toward a target sum to finance an interesting project or product. With the growing gaming industry, its potential for crowdsourcing was correctly estimated, and the very first gamesourcing game, Google Image Labeler (formerly The ESP Game), authored by Carnegie Mellon University professor Luis von Ahn,12 was launched under the auspices of Google in the same year. It was about describing pictures in a fun and catchy way. The number of similar projects has been slowly increasing since then, and awareness of gamesourcing is beginning to reach the general public. In 2016, this was mainly due to the game Sea Hero Quest, which was even advertised on domestic television stations: a game for mobile devices that is used to help study dementia. Gamesourcing seems to be on the rise. It has already proven to be an excellent tool in several cases. However, its potential is far from being fully exploited.

18.2. Motivation

What makes gamesourcing, that is, game-driven crowdsourcing, better than classic crowdsourcing? What are its advantages and benefits, and does it have any downsides?

18.2.1. Leisure

The principle of crowdsourcing lies in the voluntary participation of contributors. However, this is also its main stumbling block and most limiting factor. Participants have to sacrifice their free time for activities that often do not even benefit them, which is an indigestible problem for many people in today's hectic times.

According to 2009 research by the Institute of Sociology of the Academy of Sciences of the Czech Republic, 6% of the Czech population had no free time on weekdays, a quarter had only 1–2 h, a third 3–4 h, 18% 5–6 h, and 16% 7 or more hours,15 which works out to a weighted average of about 3.75 h of free time per working day. On Saturdays and Sundays it was almost double, about 7 h. Compared with European estimates, these figures are approximately the same,16 and that is not much (Figure 18.1). However, when we look at how this time is spent (in Europe and in the Czech Republic), we notice that after watching TV and listening to music, the most common activities are the Internet and the PC16 (in the data referred to, the first row always describes the Czech Republic, the second the European countries). Taking into account the fact that the number of Internet users is continuously growing (see Figure 18.2) and given these priorities in spending free time, we can boldly state that the online world is an ideal environment for crowdsourcing operations. The current number of Internet users has already exceeded 3.5 billion, representing almost 50% of the world's population.17 In 1995, it was only 1%. Between 1999 and 2013, the number of Internet users increased 10-fold; an Internet user is a person who has access to the Internet through a computer or mobile device at home.

Figure 18.1. Free time on weekdays of the population of the Czech Republic and free time on weekends of the population of the Czech Republic, own processing.

Figure 18.2. Number of Internet users from 1993 to 2016, own processing.

18.2.2. Game playing

According to a report by the gaming company Spil Games,18 there were 1.2 billion people playing games on Earth by 2013. Of these, 700 million played online. Thanks to the increasing availability of the Internet, online games began to come to the fore. Their success, however, is mainly due to a combination of simple controls and challenging gameplay, which makes for an exciting mix of fun and challenge. Playing games has become a mainstream affair that is no longer burdened by the stereotype of a lonely teenager locked away from the world, sitting in front of a computer in a darkened room. It is no longer just a youth activity: for example, 41% of women and 37% of men over the age of 45 play. This is not a big difference compared to the 15–24 age category, where the proportion of players is 47% among women and 54% among men.18 The development of mobile technologies has undoubtedly contributed to the spread of games to older generations; 52.7% of current Internet users access the Internet from a mobile device,19 so today you can play (online) almost anywhere: on the bus on the way to work, while waiting for a meeting, in the toilet, during free time on holiday. The average time spent in this way is approximately 40 minutes a day.18 So why do so many people play games? It is all about entertainment, but that alone would not guarantee such success to video games.

Figure 18.3. Percentage of genres in 1975–2012 (Source: Geek.com).

An easily acquired sense of satisfaction with success, relaxation, and a quick escape from the everyday hustle and bustle also play an essential role. Motivation to play repeatedly varies slightly by gender: while men are more concerned with competitiveness and achieving the best possible position above others, women are more concerned with exploring all possibilities and completing the whole game.20 The popularity of individual game genres across the years is clearly shown in Figure 18.3, created on the basis of the percentage of published titles of a given genre in a given year. We see that puzzle games, that is, various logic games and puzzles, which are mainly the domain of mobile gaming, are gaining more and more popularity. The time and effort invested in playing online games have enormous potential for crowdsourcing solutions to problems that current technology falls short on. Many things that are trivial for a person are almost unsolvable for a computer. If gamesourcing could take advantage of those 28 billion minutes (approximately 500 million hours) of online gaming per day, it could lead to big things. Furthermore, it would not cost the players anything; they would not even have to know that they are helping to solve something with their "rest" time.

18.2.3. Paradigm

There is no official definition of how gamesourcing should be structured and what it should contain. However, Luis von Ahn of Carnegie Mellon University (with Laura Dabbish), who is probably the creator of the first gamesourcing game, The ESP Game, considers in one of his articles12 how to generalize the process of creating such games. He uses the term GWAP, Games With A Purpose. Based on his many years of experience in this field, he has compiled three "templates" describing three possible game structures that, he says, can be applied to any computational problem. Each template defines the rules of the game and the conditions of winning so that it is in the players' best interest to perform the required computations. To this end, he introduced a set of principles that increase the fun and attractiveness of games with a purpose while ensuring the quality of their outputs. Finally, he introduced metrics that assess the success of a gamesourcing game in terms of the benefit gained per hour of playing.

18.2.4. Structure

The motivation of gamesourcing is based on three main factors:

— an ever-increasing percentage of the population has Internet access;
— some tasks are unsolvable by computers, but easy for humans;
— people spend much time playing computer games.

Unlike traditional crowdsourcing, gamesourcing does not rely on altruism or financial motivation to push people to collaborate. It bets on their desire to have fun. A gamesourcing game must first and foremost be fun, and the useful computation should be solved only as a side effect of playing it. The game mechanism must therefore be closely connected to the computational problem in an input–output manner: the player is presented with an input and uses it to create the output we need. The game must also have a clearly defined goal (conditions of victory) and rules defining what players may and may not do.

These elements are very important because they direct the player to take steps that solve the related problem and, at the same time, should ensure the correctness of the output even if a player tries to produce bad output. Based on these facts, three general types of gamesourcing structures can be defined.12

• Output match: In the beginning, players are randomly paired with each other. In each round, each of them is presented with the same input, and they must create outputs based on it. The game pushes the player to strive for the same outputs as his teammate. The players cannot communicate with each other in any way and do not see each other's outputs. The winner of the round is the pair that produces the same output first. This does not have to happen simultaneously; it suffices that one of the players gives the same output as a teammate gave before him. For example, The ESP Game12 was built on this principle, where the inputs were images and the outputs were keywords. Given the impossibility of communication and the mutual anonymity, it was best for the players to write outputs related to the picture. The game did not say that it was necessary to enter correct keywords, only that you need to try the same one that your teammate enters; in other words, "think like everyone else." This structure (output matching) also ensures the correctness of the outputs, as a pair of completely independent sources agree on the result. (A minimal sketch of this matching loop is given after this list.)

• Inversion: The games Peekaboom, Phetch, and Verbosity, for example, are built this way.12 In the beginning, players are randomly assigned to each other, and one of them is selected as the "descriptor"; the others become "guessers". The descriptor receives an input, on the basis of which he produces outputs, which are then sent to the guessers. From the information provided, they try to guess the input. In Verbosity, for example, the input is a word and the outputs are facts connected to it; for the input "milk", the outputs could be "it's white" or "it's cow related." The winner is the one who guesses the input, but the descriptor wins as well.

This structure of the game therefore forces him to try to create the most accurate outputs possible, so that he has the greatest chance of someone guessing the original input from them. It is a principle similar to classic children's guessing games. The game can also be set up so that the descriptor sees the guessers' attempts and can evaluate them in the style of "hot/cold". It is also advisable to alternate the roles of guesser and descriptor regularly for the sake of varied entertainment.

• Input match: This form is represented, for example, by the game TagATune.12 In the beginning, players are randomly associated with each other, and each is assigned some input. The game knows whether the inputs are the same or not, but the players do not. Subsequently, the players must create outputs, from which at some point they must judge whether their inputs are identical. In TagATune, the inputs are recordings/sounds, and the outputs are tags, feelings, and descriptions. Victory is achieved by the player who correctly determines whether he has a common input with someone or, on the contrary, a different one. To avoid random guessing, this type of game should severely penalize incorrect answers or assign a bonus score for correct answers in a row.

All of the above structures can also exist in single-player form. This can be done with the help of an AI opponent that behaves according to a predefined scenario. In the case of input- or output-matching games, this is relatively simple: the system could record players' behavior (outputs and their timing) in a multiplayer game and then replay it in a single-player game. In the case of inversion games, this is more difficult, because it is necessary to respond dynamically to the specific stimuli of the descriptor or the guessers.
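
As a minimal illustration of the output-match loop (our own sketch, not the actual ESP Game implementation), assuming each player's outputs arrive as an ordered list:

```python
from itertools import zip_longest

def output_match_round(outputs_a, outputs_b):
    """Returns the first output entered by one player that the other player
    has already entered, or None if the two streams never agree."""
    seen_a, seen_b = set(), set()
    for out_a, out_b in zip_longest(outputs_a, outputs_b):
        if out_a is not None:
            if out_a in seen_b:
                return out_a            # player A matched something B typed earlier
            seen_a.add(out_a)
        if out_b is not None:
            if out_b in seen_a:
                return out_b            # player B matched something A typed earlier
            seen_b.add(out_b)
    return None

# e.g. output_match_round(["dog", "grass", "frisbee"], ["park", "frisbee"]) -> "frisbee"
```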

18.2.5. Fun

Probably the most important aspect of (not only) gamesourcing games is fun. The outputs need to be obtained in such a way that the players enjoy themselves; when they have fun, they play longer and produce more of the desired output. Specifically, five basic principles for achieving greater fun in gamesourcing are listed here, but in general it can be said that the fun of a game is directly proportional to its level of challenge.

• Time limit: By setting a time limit for creating output, a system of bonuses for faster player performance, and so on, can be introduced. It is known from human psychology that a clearly defined and time-limited task leads to higher performance than easy and trivial tasks.21 However, the game must be well calibrated, and the stated goal must be a challenge for the player within the required time limit. It is also advisable for the remaining time to be visible.

• Scoring: Probably the most straightforward way to motivate a player is to reward each output he creates. The final score after each game tells the player how successful he has been and encourages him to try to improve his result.

• Ranks: A rank system, where a certain number of points or outputs is needed to obtain a higher rank, is a principle that has long proven itself on the gaming scene. Many players will play just to advance to the next title or rank and thus differentiate themselves from others.

• Rankings: Compiling leaderboards of the most successful players can be a strong motivation. Everyone would like to see their name in a leading position, at least for a while. Besides, by scoping the leaderboards over different periods (hourly, daily, monthly), it is possible to offer varying difficulty in achieving such a result, which for some players can significantly extend their playing time.

• Randomness: Inputs should be presented to players at random. Their difficulty will then be variable, and the game will retain its charm for veterans and novices alike. At the same time, this brings uncertainty into the game when there is a time limit, because it is never certain in advance how quickly the task will be completed. Random selection of teammates also has a positive effect on the fun: it ensures the uniqueness of each game and supports the desire to play again.

18.2.6. Credibility of outputs

The basic structures of gamesourcing games mentioned above may not always provide sufficiently reliable results. If players agreed in advance that everyone would deliberately enter the same incorrect outputs (when describing each of the images), which would be considered the winning result in the case of an output match game, the system would collect incorrect data. The following mechanisms should prevent similar types of collusion.

• Random grouping: Gamesourcing games are created for hundreds, if not thousands, of players playing at the same time from different parts of the world. If players are grouped at random, they do not know whom they are playing with, so they cannot agree on cheating in advance.

• Testing players: The player may from time to time be presented with an input for which all correct outputs are already known. If a player produces an incorrect output for such an input, he becomes suspect, and his other outputs may no longer be trusted. Depending on the number of such test inputs, the probability that a player's outputs are valid can then be estimated (see the sketch following this list). For example, if half of all the inputs presented to a player were test inputs, and he gave the correct output for all of them, his remaining outputs could be trusted with a correspondingly high probability.

• Repetition: The game should be designed so that an output is not considered correct until it has been produced by a certain number of players. In this way, essentially any desired level of output reliability can be guaranteed. For example, in an output match game that accepts an output only after n independent matches, the probability that a wrong output slips through falls rapidly as n grows.

• Taboo outputs: For problems where an input has a wide range of possible outputs, we may want to cover the whole set of outputs sufficiently. This can be achieved using taboo outputs: the game bans the most frequent correct outputs and displays them to the player as forbidden. This forces him to think in a different way and thus produce another correct output, a new or less common one.
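How test inputs and repeated matching translate into reliability estimates can be illustrated with a toy calculation (ours, not from the chapter), under the simplifying assumption that players act independently and that any single spurious match occurs with a fixed probability:

    def estimated_player_reliability(correct_tests, total_tests):
        # Fraction of known-answer test inputs the player got right, used as a
        # rough estimate of how trustworthy his untested outputs are.
        return correct_tests / total_tests if total_tests else 0.0

    def agreement_reliability(p_spurious_match, n_matches):
        # Probability that an output confirmed by n independent matches is
        # genuine, assuming each match is spurious with prob. p_spurious_match.
        return 1.0 - p_spurious_match ** n_matches

    # Example: an output accepted only after 3 matches, with a 10% chance that
    # any single match is accidental or colluding, is ~99.9% reliable.
    print(estimated_player_reliability(9, 10))   # -> 0.9
    print(agreement_reliability(0.10, 3))        # -> 0.999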

18.2.7. Evaluation of success

How does one determine the success of a gamesourcing game? If two games solve the same problem, how do we know which one is better? If we looked at them as algorithms, the decisive criterion would be complexity, that is, the number of steps needed to achieve the output. In a game, however, it may not be clear what counts as one computational step, so a slightly different perspective is needed. Below are three approaches to evaluating the success of a gamesourcing game.12 Although they do not capture some aspects important for games (such as virality or "word of mouth"), they are relatively reliable indicators of usefulness.

• Throughput: Basic throughput is defined as the average number of resolved problem instances (input-output pairs) per person-hour of playing. The principle of most games is to repeat the same task over and over, which gradually improves the player. If we want to take into account the learning and improvement (or deterioration) of a player, throughput must be calculated as the average number of solved problem instances over all players over a meaningful period of time.10

• Average lifetime play: Since this is a game, we have to take into account how much fun it is. We may have a high throughput, but what is the point if no one wants to play the game? Fun, however, is not easy to measure. It depends closely on the appearance and implementation of the game: an apparently trivial change to the user interface or scoring system may be all it takes for the fun to climb. The ALP metric,12 or "average lifetime play", is the average time spent playing the game over all players who have ever played it.

• Expected benefit: Once we know the average number of problem instances solved per person-hour of play (throughput) and the expected time spent playing our game (ALP), we can determine the benefit of each player. The expected benefit says how many solved problem instances we can, on average, expect from one player. We obtain it as the product of throughput and ALP,12 as illustrated in the sketch below.
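The three metrics combine as in the following toy calculation (all numbers are invented for the example):

    total_play_hours = 120.0   # person-hours of play, summed over all players
    instances_solved = 3000    # input-output pairs resolved in that time
    num_players = 80           # distinct players who ever played the game

    throughput = instances_solved / total_play_hours  # instances per person-hour
    alp = total_play_hours / num_players              # average lifetime play, hours
    expected_benefit = throughput * alp               # instances per player

    # Note that throughput * ALP reduces to instances_solved / num_players,
    # i.e., the average number of problem instances contributed by one player.
    print(throughput, alp, expected_benefit)          # 25.0 1.5 37.5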

18.2.8. Current status

Gamesourcing is currently not widely used; from our point of view, it consists mostly of experimental work. The vast majority of such games involve working with images, whether it is a simple determination of their content (e.g., identifying cells infected with malaria in the Malaria Training Game), transcribing photographed texts (e.g., diaries of World War I soldiers in Operation War Diary), or marking points of interest (e.g., trajectories of particles created after a collision in a giant particle accelerator in Higgs Hunters). In the overwhelming number of cases, however, these are not so much games as visually attractive, but still purebred, crowdsourcing. The following list contains games that deviate significantly from this average, especially in their design, which is not so transparent and straightforward. The fact that the player need not even know that playing helps solve a problem is one of the most interesting features gamesourcing can have, and one that ordinary crowdsourcing lacks. The projects are listed in alphabetical order; for each, we describe the problem it solves, how the game approaches it, and the results achieved so far.

18.2.8.1. Astro Drone

This game was created by the European Space Agency (ESA) to investigate whether the distance to an object can be determined by looking at a static image alone. The basic premise is the assumption that when approaching an object, there are fewer fluctuations in colours and textures. To measure these changes, the probability distribution of the presumed occurrence of different textures is used, on the basis of which the Shannon entropy is calculated. A series of 385 image sequences showed that in 88.6% of cases the entropy decreased when approaching an obstacle. However, the data used were not ideal, as most sequences were only simulations of approach obtained by scaling a single image. The remainder were created from hand-held camera recordings, which, however, limited the movement and the locations covered.22 The goal of the Astro Drone project (Figures 18.4 and 18.5) is to obtain a larger and more diverse amount of data for the analysis

Figure 18.4. Parrot AR.drone (Source: Parrot.com).

Figure 18.5. Example of playing Astro Drone (Source: www.esa.int).

of this problem, which could be used for the navigation and landing of space probes and robots. To participate, you need a Parrot AR.drone quadcopter (Figures 18.4 and 18.5), an iPhone or iPad, and the Astro Drone mobile application, which can be downloaded free of charge. The game itself then uses the drone's cameras in augmented reality and currently contains two levels. The first simulates the docking of a module to the International Space Station, and the second the flight of the Rosetta spacecraft through space debris to comet 67P/Churyumov–Gerasimenko. The player must then launch the Philae module at the right

Figure 18.6. Images taken during the flight (above) and their 5 × 5 px snippets (below) (Source: www.esa.int).

time to land safely on the comet. This process is then evaluated in the form of a score. During play, shots (Figure 18.6) are taken at even intervals in terms of distance from the "comet" or "station". If a player decides to submit his score, he at the same time agrees to send the data to ESA. For privacy reasons, however, the images are not sent in their entirety, but only as a grid of 5 × 5 pixel snippets. In order for the agency to obtain some information about the structure of the environment in which the game took place, one of the images is also sent converted to black and white, passed through a Hanning window and transformed by a discrete Fourier transform. This ensures that the original image cannot be recovered while preserving information about its geometric structure.22
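Such a windowed-transform step might look like the following minimal sketch (our own illustration, not ESA's actual pipeline). We assume here that only the magnitude of the spectrum is kept, which is one common way to make the transform non-invertible; the chapter does not spell out this detail.

    import numpy as np

    def windowed_spectrum(image):
        # Convert to black and white if the image has colour channels.
        gray = image.mean(axis=2) if image.ndim == 3 else image
        h, w = gray.shape
        window = np.outer(np.hanning(h), np.hanning(w))  # 2D Hanning window
        spectrum = np.fft.fft2(gray * window)            # discrete Fourier transform
        # Keeping only the magnitude discards the phase, so the original image
        # cannot be reconstructed, yet its geometric structure remains visible.
        return np.abs(np.fft.fftshift(spectrum))

    img = np.random.rand(64, 64, 3)        # stand-in for a captured frame
    print(windowed_spectrum(img).shape)    # (64, 64)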

18.2.8.2. EteRNA

Figure 18.7. Example from playing EteRNA.

In 2010, Carnegie Mellon University and Stanford University launched this gamesourcing project, which involves assembling RNA molecules into different shapes. The shape of an RNA molecule is crucial in many ways. By changing the shape, we can directly
change the behavior of the molecule, for example, silence a problematic gene so that it is no longer transcribed. The shape also plays a role in interactions with enzymes, or when RNA molecules act as enzymes themselves; in that case they are called ribozymes. Their ability to transform depending on chemical conditions makes them very useful: they could be "wrapped" into a shape effective for travelling through our body, unfolding only after entering the target cell. The goal of EteRNA is thus to control, and develop procedures for, the synthesis of RNA into desired shapes (Figure 18.7).23 It is a browser game in which players are presented with a sequence of bases and a shape into which the bases need to be arranged. The player uses the mouse to "mark" these bases as A (Adenine), C (Cytosine), U (Uracil), or G (Guanine), and they begin to merge and dynamically form a pattern. Different base pairs form bonds of different strengths, and in addition, a limiting factor of the type "use a maximum of one G-C bond" is often added to the game. The aim is to achieve the desired shape with the lowest possible energy, that is, with the weakest possible bonds; the final score follows from this. When a player solves a number of tasks, he unlocks the ability to create his own designs of molecule shapes for other players. Particularly interesting or promising works can be selected by the authors for

real synthesis in the laboratory. The Folding@home program, which uses the computing power of volunteers' PlayStation 3 consoles to passively process the necessary data, already deals with the same assembly problem. However, thanks to the human factor, EteRNA shows significantly better results: not faster (one instance of this problem can be processed by a powerful computer in about a minute, while it takes a person days), but more accurate, as was proved by laboratory synthesis of both computer-designed and human-designed molecules; the human suggestions were 99% better.23 EteRNA has more than 130,000 users, and the creators have already managed to develop several new algorithms based on the collected data. The first of these, EteRNAbot, surpassed all other algorithms in 19 of 20 cases.23
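The in-game scoring described above can be made concrete with a toy sketch (ours, not EteRNA's actual thermodynamic model). The bond strengths are illustrative values only, chosen so that G-C > A-U > G-U, and the "maximum of one G-C bond" rule appears as a hard constraint:

    BOND_STRENGTH = {("C", "G"): 3, ("A", "U"): 2, ("G", "U"): 1}  # toy values

    def design_energy(sequence, pairs, max_gc=1):
        # Score a base assignment against a target pairing; lower is better.
        # Returns None if a required pair cannot bond or the G-C limit breaks.
        total, gc_count = 0, 0
        for i, j in pairs:                              # positions that must pair
            duo = tuple(sorted((sequence[i], sequence[j])))
            if duo not in BOND_STRENGTH:
                return None                             # these two bases do not pair
            gc_count += (duo == ("C", "G"))
            if gc_count > max_gc:
                return None                             # "max one G-C bond" violated
            total += BOND_STRENGTH[duo]
        return total

    # A 6-base design in which positions 0-5, 1-4 and 2-3 must pair:
    print(design_energy("GAUAUC", [(0, 5), (1, 4), (2, 3)]))  # -> 3 + 2 + 2 = 7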

18.2.8.3. Foldit

Proteins are part of almost every process in our body. They break down sugars into energy, process food, send control signals to the brain, and transfer nutrients through the blood. Many of them function as enzymes, that is, catalysts for chemical reactions (not only beneficial ones) that would not otherwise be possible. There are thousands of different proteins, many of which are responsible for diseases, but they all have one thing in common: they consist of a long chain of amino acids. There are only 20 kinds of amino acids, and they always join into a long uniform chain with small side branches; this main chain forms the protein's backbone. The key is the shape into which this chain is tangled. The protein tries to form the most compact object possible, so as a result some amino acids end up inside and some outside, adjacent to each other in various ways. This then determines the function of each protein.24

In the desktop (Windows/OSX/Linux) game Foldit, created by scientists at the University of Washington, you try to achieve the highest possible score by bending and shaping the protein. Care must be taken that the gaps between the individual bodies are neither too small nor too large, so that the overall protein is as compact as possible; orange (hydrophobic) branches should be hidden inside as much as possible and blue (hydrophilic) ones kept outside; the use
of hydrogen bonds adds more points, etc. Everything takes place in a 3D environment and is controlled with a mouse, with which you can bend, move and rotate individual parts of the protein. The aim of this project is to find hitherto unknown structures of some proteins and to describe the strategies that people use, which could bring significant progress in predicting protein structures. For example, within three weeks, players determined the structure of the protein responsible for the replication of an AIDS-like virus in monkeys, a problem researchers had previously worked on without success for ten years.25 Furthermore, there is also the possibility of helping to create completely new proteins, "tailor-made" against a specific disease or for a specific purpose.

18.2.8.4. Play to Cure: Genes in Space

In this mobile game (Figure 18.8) from Cancer Research UK, you find yourself in the cockpit of a spaceship whose goal is to collect a valuable commodity called "Element Alpha". For the Element Alpha you deliver, you are rewarded with credits with which you can improve your ship in various ways, such as strengthening its engines

Figure 18.8. Play to Cure: Genes In Space demo.

or buying better lasers for shooting down the asteroids that can get in your way during missions. At the same time, you gain experience for your actions, for which you are promoted to higher military ranks. However, the crucial moment of the whole game takes place just before the start of each mission. You are presented with a scan of the area showing clusters of Element Alpha, and you must plan a route that flies through the largest deposits possible. In reality, this is remapped data from a DNA chip. DNA consists of the bases A, C, G and T arranged in a double helix. The sequence of bases is essential for the proper functioning of our body. If, for some reason, some parts of DNA are assembled differently than they should be, genetic defects occur. The problem is that there are many such differences and they can cause the loss or gain of various genes, not just the one we are interested in, in this case the genes responsible for cancer. DNA samples from many cancer patients need to be compared to determine the relevant segments. This is done with DNA chips, which can compare thousands of samples at once. But this raises another problem: a huge amount of data to process.26 There is software that can do this, but nothing compares to the perception of the human eye. Figures 18.9 and 18.10 show the output of a DNA chip. The pink horizontal band is the part of the DNA under investigation, and fluctuations in it indicate the number of extra copies of a given segment. We see here four such important areas (they are marked).

Figure 18.9. DNA chip data (Source: scienceblog.cancerresearch.org).

Figure 18.10. Data from the previous image transferred to the game (Source: scienceblog.cancerresearch.org).

But it is necessary to determine exactly where they begin and where they end, and that is exactly what the player does when planning the rocket's route before the start of each mission. The game converts this output nicely, and it is up to the players to mark the given areas in order to collect the most Element Alpha. Playing this game thus greatly speeds up the analysis of the causes of various types of cancer and, in turn, the research of new drugs against them.

18.2.8.5. Sea Hero Quest

Dementia significantly worsens the quality of life and, according to expert estimates, will affect 135 million people by 2050. It is not a disease in itself: dementia is a term describing the symptoms that occur when brain cells stop working properly. Damage to these cells impairs their ability to communicate with each other, which in turn affects the thinking, behavior, and feelings of the affected person.27 Although dementia mainly affects the elderly, it cannot be considered an inevitable part of ageing. Dementia does not choose its victims: it is a disease that can affect anyone, regardless of origin, education, lifestyle or medical condition. There is no cure for dementia. While some procedures may improve the quality of life of sufferers, there

is no treatment that can slow or stop diseases such as Alzheimer's disease, which means that the disease worsens over time.27 The only hope remains the discovery of a new method of treatment.

The Sea Hero Quest player becomes a seafarer who traverses the waters of the seas and oceans and takes pictures of their inhabitants, applying his perception and sense of orientation as he goes. Deterioration of spatial orientation is one of the accompanying phenomena of Alzheimer's disease, which causes 60–70% of dementia cases. By playing the game, people help scientists better understand this process and distinguish when impaired orientation is a natural consequence of ageing and when it is already a symptom of the disease. On this basis, it will then be possible to develop new and much more accurate diagnostic tests for dementia and to make progress in its treatment.27

This is a game for mobile devices, with many progressively unlocked levels of increasing difficulty. Before setting out, the player is shown a map of the area and the locations of buoys, which must be passed in the specified order. Then the map disappears, and the player is completely dependent on his memory. Depending on the speed of completing the task, he is rewarded with stars, for which his boat can be decorated in various ways. In another variant, you are taken to the destination and must determine the direction to your starting point as the crow flies. After the completion of each round, the data are sent to University College London for analysis. Just 2 minutes of play provide the same amount of data as 5 hours in a comparable laboratory experiment. More than 2.5 million players have already taken part in the game and recorded a total of 63 years of playing time.

18.3. Application — People versus Differential Evolution in Search of the Shortest Path

One of the most common uses of computer games is finding the shortest path, which can be understood as the problem of a travelling salesman within a small group of cities. A papera discussing this topic,28

a http://jaec.vn/index.php/JAEC/article/view/292.

concerned with comparing the solutions of a given problem found by human intelligence with those found by evolutionary algorithms, is one of numerous papers on the subject. Here we briefly report its content. Human intelligence is represented by the players of a mobile game for the Android operating system, by their conduct while playing, and by the results they achieve. Evolutionary algorithms are represented by differential evolution (DE), for which the best possible parameter setting is sought based on the players' results, so that DE provides results of a quality comparable to human players. A further task is to verify whether this setting is suitable for all mazes, and whether people or differential evolution are better at searching.

You will hardly meet a person who has never played a computer game, whether on a phone or on a computer. People enjoy filling free moments with computer games and, what is more, they even organize tournaments in this activity. According to Syracuse University's online MBA program (2019), watching computer game players becomes more popular every year, and some events are becoming as popular as classic sports tournaments. There is even discussion of adding computer games to the eSports category of disciplines at the 2024 Summer Olympic Games in Paris.29

One of the most common problems in computer games, as in the information sciences in general, is finding the shortest path through a field of obstacles. The problem can be generalized as searching for the shortest path through a maze, where we need to find our way from point A to point B. Solutions comprise various algorithms based on various principles.30 The simplest is brute force, where the area is searched breadth-first ("widthwise"); this can be optimized by performing the breadth-first search in the assumed direction of the goal. The most sophisticated and complex solutions use evolutionary algorithms. There are various options, and the limiting factors are always the required accuracy of the results and how time-demanding the search is.
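The brute-force "widthwise" search just mentioned is ordinary breadth-first search; a minimal grid-maze version (our generic sketch, not the code used in the study) is:

    from collections import deque

    def shortest_path_length(maze, start, goal):
        # maze: list of rows; 0 = free cell, 1 = wall; 4-directional moves.
        rows, cols = len(maze), len(maze[0])
        queue = deque([(start, 0)])
        seen = {start}
        while queue:
            (r, c), dist = queue.popleft()
            if (r, c) == goal:
                return dist
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols \
                        and maze[nr][nc] == 0 and (nr, nc) not in seen:
                    seen.add((nr, nc))
                    queue.append(((nr, nc), dist + 1))
        return None  # the goal is unreachable

    maze = [[0, 0, 1],
            [1, 0, 1],
            [1, 0, 0]]
    print(shortest_path_length(maze, (0, 0), (2, 2)))  # -> 4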

In this chapter, we will be comparing
the conduct of people in a computer game with the use of the differential evolution algorithm. We collect the shortest-path solutions found by computer game players and compare them with the solutions found by differential evolution (DE).28 The aim is to find a suitable DE setting which could relatively easily provide solutions of a quality comparable with those of real people playing the game, who can practically see the solutions in front of them in small mazes. We verify whether the same DE setting can be applied to various kinds of maze and, at the same time, determine whether real people or DE achieve better search outcomes.

The entire work is divided into two tasks. The first is to program a game application for mobile phones with the Android OS, with a server part that records the individual outcomes. The second is to create a DE-based solver that searches for solutions computationally. The two are then compared and, thanks to the experience gained from this comparison, DE is tuned further. To make the game popular and widely available, the game "Travelling Salesman" was programmed for mobile phones with the Android operating system.28 The app can be installed from the Google Play server.b Upon installing and starting the game, players enter a nickname, by means of which they are compared to other players while the same or a better solution is sought. The path the players took in order to collect the entire jackpot is recorded in logs on the server side. The game consists of 10 levels with progressively increasing difficulty, as the searched area keeps enlarging. Figures 18.11 and 18.12 compare graphically the first and the last level generated.

Upon starting the game on an Android device, the device connects to the application server, and the maps for the given game

b https://play.google.com/store/apps/details?id=cz.bukacek.travellingsalesman.

Figure 18.11. Comparison of Level 1 and Level 10 (Source: Custom graphic processing).

tasks are downloaded along with the highest scores attained for the given levels. In this way the players see the highest score attained, which they attempt to exceed or at least to match. The course of the game is saved into a log and sent to the server upon completion of the given round. Humans are represented by a group of 10 people aged 7–60; each person plays each level at least once.

The result of comparing human conduct with DE conduct on the same local path is also worth mentioning. DE is set in such a way that the final path length is penalized for already visited places, proportionally to the number of such repeated visits. For instance, if we are supposed to get from point A to point B, as displayed in Figures 18.13 and 18.14, and we wish to visit points 1, 2 and 3 on the way as well, we have two options: the path in Figure 18.13 or the path in Figure 18.14. DE, guided by its traversal and assessment, will arrive at the path in Figure 18.13, as the path in Figure 18.14 is penalized for visiting city 2 repeatedly. As opposed to that, humans tend to select the variant in Figure 18.14. It does not matter which variant is selected, as both paths are of the same length: both figures display a

Figure 18.12. Comparison of Level 1 and Level 10 (Source: Custom graphic processing).

part from city 1 to city 2 of the same length and two identical parts between cities 2 and 3. We may say that the first human solution is not bad at all, because people use their common sense at a crossroads and do not select the paths between the cities randomly, as reported in Ref. [28]. The given area can be passed in different ways: a thinking person subconsciously selects the shortest local path and solves the passage through the entire map as a compound of the shortest paths.
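The penalization of repeated visits described above can be sketched as follows (our illustration; the exact penalty coefficient used in Ref. [28] is not reproduced here). Two routes with the same number of steps then receive different fitness values if one of them revisits a cell:

    def penalized_path_length(path, step_cost=1.0, penalty=1.0):
        # path: sequence of visited cells; every repeated visit adds a penalty
        # proportional to the number of repetitions. Lower fitness = better.
        visited, repeats = set(), 0
        for cell in path:
            if cell in visited:
                repeats += 1  # returning to an already visited place
            visited.add(cell)
        return step_cost * (len(path) - 1) + penalty * repeats

    a = [(0, 0), (0, 1), (0, 2), (1, 2)]  # variant (A): no cell revisited
    b = [(0, 0), (0, 1), (0, 0), (1, 0)]  # variant (B): returns to (0, 0) once
    print(penalized_path_length(a), penalized_path_length(b))  # 3.0 4.0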

Figure 18.13. Variant (A) Local path from point A to point B (Source: Custom processing — app screenshot).

From the comparisons, it emerges that humans mostly use an algorithm of finding the shortest connections and attempting to pass them in an order that makes the total path as short as possible. The intuitive human solution is correct and fast for small mazes. Rather than relying on chance, people think rationally, which is especially evident at the low levels.

Our game levels can be divided into three types depending on their size: low levels (small mazes of levels 1 and 2), medium

Figure 18.14. Variant (B) Local path from point A to point B — with repetition (return to an already visited place).

levels (3–7) and high levels (8–10). At the low levels, the results of people and DE are identical: both found short paths of the same length within the maze, and DE found more path variants of the same length. At the medium levels, the results of DE are better than those of people. These maps contain some blind alleys with only one entrance, but not so many of them, so DE with the same settings as for the low levels did well and found the best combination; DE is better at these levels. As opposed to

that, people are the obvious winners when passing through large mazes. There are more blind alleys there, and it is necessary to combine them better. DE starts connecting blind alleys which are not always neighbouring, but may lie a leap apart across the entire game plan, which results in frequent movement through passages that have already been taken. In large mazes, people are able to keep the entire path in mind and to orient themselves, so when there is an unvisited place next to the place they have just passed, they visit it as well. This is certainly better than crossing the entire game map only to return to a blind alley next to a place already visited. On the other hand, even when passing small game maps, DE needs a relatively large number of evolution cycles and a large population to reliably offer the shortest path. We may say that people come to the same or even better conclusions than DE with less effort.

For DE, much depends on whether the maze contains blind alleys that can be accessed only via a single path, especially in large mazes. If there are a few such alleys, DE will search through them, find the shortest path and move on; it searches the area locally, but it cannot make a global connection. If there are many such alleys, especially in large mazes, the paths must be combined and new options tried. From the measured values and from observing the algorithm, it emerges that it is always better to set a higher crossover rate CR (from the range 0–100%), approximately 80%, as it leads to more traversals being tried, and to set the parameter F (from the range 0–1) within 0.5–1. With these settings, the algorithm tries more paths and more combinations of cities. There is no single common DE setting for all kinds of mazes, as it depends on the size of the maze, the number of cities, the number of blind alleys and on which alleys neighbour a given field. A good solution can appear even in the first generations, and not all solutions derived from it are necessarily as good.

Gamesourcing is definitely an interesting alternative way of computing complex problems. For more, we recommend Refs. [33] and [34].
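For reference, the classic DE/rand/1/bin scheme, in which F weights the difference vector and CR is the crossover probability (expressed here as a fraction, so 80% corresponds to 0.8), can be sketched as follows. This is a generic implementation in the spirit of Ref. [31], not the authors' maze-specific encoding:

    import random

    def differential_evolution(fitness, dim, bounds, pop_size=30,
                               F=0.8, CR=0.8, generations=200):
        lo, hi = bounds
        pop = [[random.uniform(lo, hi) for _ in range(dim)]
               for _ in range(pop_size)]
        for _ in range(generations):
            for i in range(pop_size):
                a, b, c = random.sample(
                    [p for j, p in enumerate(pop) if j != i], 3)
                j_rand = random.randrange(dim)  # at least one mutated gene
                trial = [a[k] + F * (b[k] - c[k])                # mutation, weight F
                         if k == j_rand or random.random() < CR  # crossover, rate CR
                         else pop[i][k]
                         for k in range(dim)]
                trial = [min(max(v, lo), hi) for v in trial]     # clip to bounds
                if fitness(trial) <= fitness(pop[i]):            # greedy selection
                    pop[i] = trial
        return min(pop, key=fitness)

    # Example: minimize the sphere function in 5 dimensions.
    best = differential_evolution(lambda x: sum(v * v for v in x), 5, (-5.0, 5.0))
    print([round(v, 3) for v in best])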

Acknowledgment

This work was supported by the internal grant SGS SP2020/78 of VSB-TU Ostrava.

References

1. S. Kuipers, Citizen Crowds Can Make a Big Impact on Elections. Daily Crowdsource.
2. L. P. Fish, Island tvoří novou ústavu on-line [Iceland creates a new constitution online]. Aktuálně.cz. [Online] 14 June 2011. http://zpravy.aktualne.cz/ekonomika/technika/island-tvori-novou-ustavu-on-line/r~i:article:703969/
3. M. Frojdova, Crowdsourcing. wikisofia. [Online] 10 January 2015. https://en.wikipedia.org/wiki/Crowdsourcing
4. J. Handl, Crowdsourcing není novinka [Crowdsourcing is not new]. Lupa.cz. [Online] 14 August 2010. http://www.lupa.cz/clanky/crowdsourcing-neni-novinka/
5. The problem of longitude. Online library. [Online] 2010. http://wol.jw.org/cs/wol/d/r29/lp-b/102010170
6. S. Ellis, A History of Collaboration, a Future in Crowdsourcing: Positive Impacts of Cooperation on British Librarianship. Crowd Consortium. [Online] 2014. http://www.crowdconsortium.org/wp-content/uploads/A-History-of-Collaboration-a-Future-in-Crowdsourcing-Positive-Impacts-of-Cooperation-on-British-Librarianship.pdf
7. S. Winchester, The Surgeon of Crowthorne: A Tale of Murder, Madness and the Love of Words. Viking, 1998.
8. R. Sochorek, Oxford English Dictionary slaví 80 let [The Oxford English Dictionary celebrates 80 years]. Sochorek. [Online] 14 October 2008. http://www.sochorek.cz/cz/pr/blog/1224018334-oxford-english-dictionary-slavi-80-let.htm
9. 5 Famous Logo Contests — Toyota, Google, Wikipedia & More! DesignCrowd. [Online] 19 November 2010. http://blog.designcrowd.com/article/218/5-famous-logo-contests--toyota-google-wikipedia-more
10. Crowdsourcing is Not New — The History of Crowdsourcing (1714 to 2010). DesignCrowd. [Online] 28 October 2010. http://blog.designcrowd.com/article/202/crowdsourcing-is-not-new--the-history-of-crowdsourcing-1714-to-2010
11. D. A. Grier, When Computers Were Human. YouTube. [Online] 6 June 2005. https://www.youtube.com/watch?v=YwqltwvPnkw
12. L. Von Ahn and L. Dabbish, Designing Games With A Purpose. Carnegie Mellon University. [Online] July 2008. http://www.cs.cmu.edu/~biglou/GWAP_CACM.pdf
13. D. G. Stork, The Open Mind Initiative: An Internet-based distributed framework for developing. Johns Hopkins Whiting School of Engineering. [Online] 14 November 2000. http://www.clsp.jhu.edu/events/the-open-mind-initiative-an-internet-based-distributed-framework-for-developing-david-g-stork-ricoh-silicon-valley/#.WH-RYFPhBhE

14. J. Howe, The rise of crowdsourcing. Wired. [Online] 1 June 2006. https://www.wired.com/2006/06/crowds/
15. G. Šamanová, Leisure. Center for Independent Public Opinion Poll. [Online] 14 January 2010. http://cvvm.soc.cas.cz/media/com_form2content/documents/c1/a3718/f3/100994s_OZ100114.pdf
16. J. Šafr and V. Patočková, Spending free time in the Czech Republic in comparison with European countries. Center for Independent Public Opinion Poll. [Online] 2010. http://cvvm.soc.cas.cz/media/com_form2content/documents/c3/a4013/f11/100119s_Traveni%20volneho%20casu.pdf
17. Internet Users. Internet Live Stats. [Online] 25 January 2017. http://www.internetlivestats.com/internet-users/
18. Spil Games, State of Online Gaming Report. [Online] 2013. https://www.fool.com/investing/general/2013/10/07/g2e-the-state-of-online-gaming-2013.aspx
19. J. Stevens, Internet Stats & Facts for 2016. Hosting Facts. [Online] 11 August 2016. https://hostingfacts.com/internet-facts-stats-2016/
20. N. Yee, 7 things we learned about primary gaming motivations from over 250,000 gamers. Quantic Foundry. [Online] 15 December 2016. http://quanticfoundry.com/2016/12/15/primarymotivations/
21. E. A. Locke and G. P. Latham, New directions in goal-setting theory. University of Baltimore.
22. Astro Drone Data. European Space Agency. [Online] 2017. http://www.esa.int/gsp/ACT/ai/projects/astrodrone_scientific.html
23. G. Templeton, Players of crowdsourcing game beat supercomputers at designing RNA molecules. Geek.com. [Online] 29 January 2014. https://www.geek.com/news/players-of-crowdsourcing-game-beat-supercomputers-at-designing-rna-molecules-1583401/
24. The Science Behind Foldit. Foldit. [Online] 5 February 2017. http://fold.it/portal/info/about
25. A. Burke, Games that solve real problems: Crowdsourcing biochemistry. Forbes. [Online] 27 October 2011. http://www.forbes.com/sites/techonomy/2011/10/27/games-that-solve-real-problems-crowdsourcing-biochemistry/#695c81211990
26. J. Owens, Can the power of the public help personalize cancer treatment? Cancer Research UK. [Online] 1 March 2013. http://scienceblog.cancerresearchuk.org/2013/03/01/can-the-power-of-the-public-help-personalise-cancer-treatment/
27. Sea Hero Quest. [Online] 6 February 2017. http://www.seaheroquest.com/cs/faq
28. M. Bukáček, People vs differential evolution in search of the shortest path. J. Adv. Eng. Comput. 4(3), 207–217 (2020).
29. International Olympic Committee (2019). https://www.olympic.org/news/declaration-of-the-8th-olympic-summit
30. D. Green, A. Aleti and J. Garcia, The nature of nature: Why nature-inspired algorithms work. In Nature-Inspired Computing and Optimization: Theory and Applications, Modeling and Optimization in Science and Technologies (Springer, 2017). ISBN 9783319509198.
31. K. V. Price, R. M. Storn and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization (Natural Computing Series), 1st edn. (Springer, 2005). ISBN-13: 978-3540209508.
32. Android developer page (2020). https://developer.android.com/
33. I. Zelinka, M. Němec and R. Šenkeřík, Gamesourcing: perspectives and implementations. In: Simulation and Gaming. IntechOpen, 2017.
34. I. Zelinka et al., From Darwinian evolution to swarm computation and gamesourcing (2019).


Handbook of Unconventional Computing
Volume 2: Implementations

WSPC Book Series in Unconventional Computing
Print ISSN: 2737-5218; Online ISSN: 2737-520X

Published: Handbook of Unconventional Computing (In 2 Volumes)
Volume 1: Theory; Volume 2: Implementations
Edited by Andrew Adamatzky

Editor
Andrew Adamatzky
University of the West of England, Bristol, UK

World Scientific
New Jersey, London, Singapore, Beijing, Shanghai, Taipei, Chennai, Tokyo, Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

WSPC Book Series in Unconventional Computing
HANDBOOK OF UNCONVENTIONAL COMPUTING (In 2 Volumes)
Volume 1: Theory
Volume 2: Implementations
Copyright © 2022 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 978-981-123-503-0 (set_hardcover) ISBN 978-981-123-526-9 (set_ebook for institutions) ISBN 978-981-123-527-6 (set_ebook for individuals) ISBN 978-981-123-571-9 (vol. 1_hardcover) ISBN 978-981-123-572-6 (vol. 1_ebook for institutions) ISBN 978-981-123-573-3 (vol. 2_hardcover) ISBN 978-981-123-574-0 (vol. 2_ebook for institutions) For any available supplementary material, please visit https://www.worldscientific.com/worldscibooks/10.1142/12232#t=suppl Typeset by Stallion Press Email: [email protected] Printed in Singapore


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_fmatter

Preface

Progress relies on exploration. An exploration is the departure from, or challenge of, traditional norms, and an assessment of the other possibilities or choices available. This book uncovers a wide range of amazing paradigms, algorithms, architectures and implementations of computation, often far outside the comfort zone of mainstream, conventional computing sciences. There are two volumes: one deals mostly with theoretical results and algorithms, the other with experimental laboratory implementations or computer models of novel computing substrates. The distribution of chapters between the volumes is often subjective, because the majority of the chapters fit well into both the "theoretical" and the "implementation" categories.

The first volume overviews topics related to the physics of computation, the theory of computation, information theory and cognition, and evolution and computation. The boundaries between the topics are fuzzy and many chapters cover more than one topic. Physics of computation deals with analog computation, quantum computation and field computation, programmable matter, artificial morphogenesis, reversible logic elements with memory, the interplay between the costs of realising physical computation and the computing capacity one can harvest from a physical device, and physical randomness in computation. Information theory and cognition are covered by estimations of integrated information (information generated by a system beyond the information generated by its individual elements) using complexity measures, and by architectures for adding narratives to robot cognition, including an experimental

scenario for investigating the narrative hypothesis in a combination of physical and simulated robots. The theory of computation is presented by a study of the computational power of the scatter machine bounded in polynomial space, a review of exclusively quantum problems (those which a Turing machine cannot solve), an extensive overview of membrane computing concepts, and insights into symmetric automata and computation. Other important topics include evolving Boolean regulatory networks with variable gene expression times, swarm and stochastic computing for global optimization, unsupervised learning on bio-inspired topologies, and probabilistic logic gates in the asynchronous Game of Life.

Topics presented in the second volume deal with computing in chemical systems, novel materials, biopolymers, alternative hardware and unclassified topics. Chemical computing includes mimicking Boolean logic gates with enzyme reactions, chemical oscillatory reactions, computation with polymerase strand displacement reactions, chemical automata, and the training of chemical classifiers. The chapters on novel materials introduce organic memristive devices for bio-inspired applications, sensing and computing with liquid marbles, colloid droplet processors, and evolving conductive polymer neural networks. The biopolymers part of the volume is about biomolecular motor-based computing and optics-free DNA microscopy. Alternative hardware includes a hybrid computer approach to training a machine learning system, computing with square electromagnetic pulses, logical gates in the natural erosion of sandstone, fully analog memristive circuits for optimization tasks, and wave-based majority gates with cellular automata. Other (unclassified) exciting topics are information processing in plants with hormones, creative quantum computing, and digital logic with Minecraft.

All chapters are self-contained and accessible to a reader with basic training in the exact sciences. This treatise on alternative computing appeals to everyone, from high-school students to university professors, from mathematicians, computists and engineers to chemists, biologists, and materials scientists.


Contents

Preface

Chapter 1. From Oscillatory Reactions to Robotics: A Serendipitous Journey Through Chemistry, Physics and Computation
Maria Lis, Shu Onuma, Dawid Przyczyna, Piotr Zawal, Tomasz Mazur, Kacper Pilarczyk, Pier Luigi Gentili, Seiya Kasai and Konrad Szacilowski

Chapter 2. Computing by Chemistry: The Native Chemical Automata
Marta Dueñas-Díez and Juan Pérez-Mercader

Chapter 3. Discovering Boolean Functions on Actin Networks
Andrew Adamatzky, Stefano Siccardi, Florian Huber, Jörg Schnauß and Jack Tuszyński

Chapter 4. Implication and Not-Implication Boolean Logic Gates Mimicked with Enzyme Reactions — General Approach and Application to Signal-Triggered Biomolecule Release Processes
Evgeny Katz

Chapter 5. Molecular Computation via Polymerase Strand Displacement Reactions
Shalin Shah, Ming Yang, Tianqi Song and John Reif

Chapter 6. Optics-Free Imaging with DNA Microscopy: An Overview
Xin Song and John Reif

Chapter 7. Fully Analog Memristive Circuits for Optimization Tasks: A Comparison
F. C. Sheldon, F. Caravelli and C. Coffrin

Chapter 8. Organic Memristive Devices for Bio-inspired Applications
Victor Erokhin

Chapter 9. On Wave-Based Majority Gates with Cellular Automata
Genaro J. Martínez, Andrew Adamatzky, Shigeru Ninagawa and Kenichi Morita

Chapter 10. Information Processing in Plants: Hormones as Integrators of External Cues into Plant Development
Mónica L. García-Gómez and George W. Bassel

Chapter 11. Hybrid Computer Approach to Train a Machine Learning System
Mirko Holzer and Bernd Ulmann

Chapter 12. On the Optimum Geometry and Training Strategy for Chemical Classifiers that Recognize the Shape of a Sphere
Jerzy Gorecki, Konrad Gizynski and Ludomir Zommer

Chapter 13. Sensing and Computing with Liquid Marbles
Andrew Adamatzky, Benjamin de Lacy Costello, Thomas C. Draper, Claire Fullarton, Richard Mayne, Neil Phillips, Michail-Antisthenis Tsompanas and Roshan Weerasekera

Chapter 14. Towards Colloidal Droplet Processors
Alessandro Chiolerio and Andrew Adamatzky

Chapter 15. Biomolecular Motor-based Computing
Arif Md. Rashedul Kabir and Akira Kakugo

Chapter 16. Computing with Square Electromagnetic Pulses
Victor Pacheco-Peña and Alex Yakovlev

Chapter 17. Creative Quantum Computing: Inverse FFT Sound Synthesis, Adaptive Sequencing and Musical Composition
Eduardo R. Miranda

Chapter 18. Logical Gates in Natural Erosion of Sandstone
Alexander Safonov

Chapter 19. A Case of Toy Computing Implementing Digital Logics with "Minecraft"
Stefan Höltgen, Thomas Fecker, Simon Pleikies, Natalie Wormsbecher and Silvio Divani

Chapter 20. Evolving Conductive Polymer Neural Networks on Wetware
Megumi Akai-Kasaya and Tetsuya Asai

Index

© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0001

Chapter 1

From Oscillatory Reactions to Robotics: A Serendipitous Journey Through Chemistry, Physics and Computation

Maria Lis∗, Shu Onuma†, Dawid Przyczyna∗,‡, Piotr Zawal∗,‡, Tomasz Mazur∗, Kacper Pilarczyk∗, Pier Luigi Gentili¶, Seiya Kasai† and Konrad Szacilowski∗

∗ Academic Centre for Materials and Nanotechnology, AGH University of Science and Technology, Kraków, Poland
† Research Center for Integrated Quantum Electronics and Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
‡ Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, Kraków, Poland
¶ Department of Chemistry, Biology, and Biotechnology, University of Perugia, Perugia, Italy

The continuous search for more efficient and energy-effective computing technologies drives researchers into various fields, seemingly not related to computing at all. It turns out, however, that system dynamics is a powerful computational medium, irrespective of the physical nature of the system itself. This review presents a potpourri of systems and devices which share a common feature: they evolve in time, respond to external signals and are thus suitable for information processing. This makes them useful for computational purposes, and even for such demanding applications as autonomous robotics.

Det er vanskeligt at spå, især når det gælder Fremtiden.
(It is difficult to make predictions, especially about the future.)
Karl Kristian Vilhelm Steincke (1880–1963)

1.1. Introduction

Modern society is mostly driven by information. Entering the era of Big Data and the Internet of Things created the need for fast and energy-efficient devices for information acquisition, transmission, processing, and storage. Information storage and processing is, however, an extremely energy-demanding task. In 2016, data centers all over the world were estimated to have consumed over 416 TWh of electric energy, more than the United Kingdom during the same period (300 TWh).1 It can be estimated that ca. 3.2% of the total anthropogenic emission of carbon dioxide is a result of high-performance computation. It should also be noted that the amount of energy consumed by supercomputing centres doubles every four years. This makes the search for novel, bioinspired and energy-efficient computing a socially and environmentally important field.

On the other hand, Nature has created an extremely complex and energy-efficient computing system based on wetware: the human brain. It comprises ca. 10–20 billion neurons in the cerebral cortex and an additional 55–70 billion neurons in the cerebellum.2 The structural complexity is a result of a large number of connections between neurons: up to 10^4 synaptic connections with other neurons for each neuron. A network with 9 × 10^10 nodes and 4 × 10^14 dynamically weighted links is thus an uncopiable system and, despite great efforts, one with a still unknown structure and capabilities.3 This amazing structure gives us creativity and intelligence. It also stimulates research on computing systems that mimic some of the brain's functionalities, like speech or face recognition. This is the domain of artificial intelligence (AI). Most software and hardware AI implementations try to mimic the complexity of nervous systems using digital (binary) and rigorously deterministic algorithms, which is just the next step in the development of the classical Turing machine.4 Artificial systems (hopefully) lack creativity; their goals are formulated by humans and embedded in hardware and/or software. Despite the fact that the development of a truly creative AI is usually likened to Pandora's box, as can be seen in catastrophic science-fiction movies, there is an ongoing


intense research in this field. The concepts of AI have also cross-fertilized other fields of science, including chemistry and materials science. As a result, molecular logic,5–10 molecular computing,11,12 unconventional computing13,14 and in materio computing15 have emerged as independent — but partially overlapping — fields of research. In materio computing can in principle use any kind of material, provided it shows some responsiveness to external stimuli, nonlinear properties, memory features, and internal dynamics.15 The computational efficiency of the material and the complexity of the problems that it can solve depend on the properties of the material under study. The most interesting are the materials called schizophrenic by E.W. Plummer: materials that develop a multitude of different "personalities" upon physical stimulation. More technically, computational materials should have numerous ground states of different properties (optical, electrical), and the material should be easily switched from one state to the other.16 Very similar requirements are posed for materials applied in soft robotics,17 and, more interestingly, some soft robotic materials and structures can be directly used as a computational medium.18–21 In this context, materials and systems with internal dynamics (e.g., chemical oscillators) seem to be of special interest.22,23

The key concept in neuromorphic and unconventional information processing is dynamics. Neural systems of humans and animals are usually understood as extremely complex systems of partially coupled oscillators.24,25 To make the whole system more convoluted, these oscillators, depending on their own history and the environment, may enter numerous distinct oscillatory modes.26 Therefore, dynamic systems are considered the most promising unconventional computing platform.27–30 One of the unconventional computing paradigms — reservoir computing — very naturally and extensively exploits the dynamic features of various physical systems.31–33 Quite unexpectedly, reservoir computing, despite its complexity and the inherent difficulties in the construction of efficient reservoirs, became a field of vigorous studies. Nobody expected reservoir computing!34 Dynamics


of complex systems is also naturally associated with delayed feedback, which originates from the finite velocity of signal propagation or processing. This delay can be understood as a form of memory of a dynamic system (1.1):35

$$\frac{dx}{dt} = f\big(x(t),\, x(t-\tau)\big) \tag{1.1}$$

where f denotes a nonlinear function and τ is the delay time. Because the current state of the system, that is, x(t₀), depends on its own history, this dynamics in many cases can be controlled and harnessed for computing.33

The chapter focuses on the computational properties of switchable materials (applied, e.g., in memristors) as well as oscillatory systems (chemical and electrochemical oscillators), especially in the context of reservoir computing (single node echo state machines) for sensing, signal processing, control, and autonomous robotics applications.

1.2. Systems Dynamics as a Computational Platform

Human intelligence emerges from the complex structural and dynamical features of our nervous system. The cellular building blocks of our nervous system are neurons. The ultimate computational power of our nervous system relies on the neural dynamics and the behavior of neural networks.36 Every neuron is a nonlinear dynamic system.24,37 Neurons can work in either the oscillatory, chaotic, or excitable regime. Every neuron responds to inputs by changing the value of its transmembrane potential. When a neuron is at rest, it usually has a negative transmembrane potential; the neuron is said to be hyperpolarized. If it receives inhibitory signals, the transmembrane potential becomes more negative, that is, the degree of hyperpolarization increases. On the other hand, if the neuron receives excitatory signals, its transmembrane potential becomes less negative, and the neuron depolarizes. When the excitatory signal is so strong that the transmembrane potential reaches a threshold value, the neuron fires an action potential. An action potential is a swift modification of the transmembrane potential of the neuron, which, at first, jumps from a negative to a positive value


(the neuron depolarizes) and then becomes negative again (the neuron restores the hyperpolarized state) within a few milliseconds. The action potential propagates through the axon of the neuron as an electrochemical wave,38 until it reaches the synapses of the neuron, where it is transduced into chemical signals. The chemical signals are the neurotransmitters that are released to the dendrites of other connected neurons. A neuron in the oscillatory or chaotic regime fires action potentials periodically or chaotically, respectively. A neuron in the excitable regime can show three types of response: type I, II, and III. Under a constant excitatory input, type I and type II excitable neurons are capable of spiking repetitively across a broad range of frequencies depending on the intensity of the input. Type I (5–150 spikes/s) and type II (75–150 spikes/s) neurons are said to have "tonic" excitability. Type III excitability is said to be "phasic" because such a neuron responds with an analog signal without firing action potentials, unless it receives an extreme and short excitation. Phasic excitable neurons are particularly useful to encode the occurrence and time of rapid changes in the stimulus. They can perform coincidence detection, as for inputs from two ears, with extraordinary, submillisecond temporal precision. The computational power of neural networks hinges on the synchronization phenomena the neurons can originate.

Development of alternative computing approaches requires the exploration of various dynamic systems, harnessing their nonlinearity and exploring their memory features. The next step involves the development of computing platforms using various materials and substrates and exploring various physical phenomena. Up to now, the material criteria for the best alternative computing system are not known; however, the exploitation of nonlinearity and dynamics points towards reservoir computing as a paradigm of choice. Unfortunately, generalized design principles for reservoirs are still lacking.39 Therefore, software reservoirs are usually optimized by brute-force search,40 whereas in materio implementations require trial-and-error selection of materials and device configurations followed by laborious manual tuning. Fortunately, some tools to evaluate the performance of reservoirs have been developed. They include measures of their


nonlinearity, Lyapunov stability, computational capacity, and fading memory.39,41–45

There exist chemical systems in liquid solutions which can mimic all the possible dynamics of neurons. Examples of these systems are the well-known Belousov–Zhabotinsky reaction,46,47 the Orbán reaction, and photochromic and luminescent compounds, in the presence or absence of convective motions of the solvent.48,49 The information is encoded through the intensities and the spectral compositions of the light they transmit and/or emit.50 Other examples of systems which possess a great neuromimetic potential can be found among the electrochemical oscillators. Probably the best known electrochemical oscillator is the mercury beating heart — a spectacular experimental demonstration of periodic oscillations of liquid mercury contacted with an oxidizing electrolyte. The first works in this field were reported over two centuries ago independently by Alessandro Volta and William Henry,51 whereas the first more detailed study of this phenomenon was reported by Friedlieb Ferdinand Runge.52 Other well-known electrochemical oscillatory systems have been reported by Koper53 and Lev.54 The indium/thiocyanate electrochemical oscillator is based on the reduction of indium(III) by thiocyanate, which is affected by the potential applied to the electrode.53 Furthermore, this oscillator has an interesting feature — mixed-mode oscillations and "chaotic" regions, which can also be found in biological systems, as was mentioned earlier in this work. Since their discovery over two centuries ago, enormous progress has been made in understanding both the chemical nature of these processes and their nonlinear dynamics.55,56 These systems are also postulated as computing27 and decoding systems.57 Although the appearance of oscillations in electrochemical systems may have different origins, the presence of coupling between reactive sites58 allows us to consider electrochemical oscillators as a future platform for artificial neuromimetic systems and devices. Furthermore, related phenomena are considered as a soft robotic platform59 with embedded intelligence60 and have already been used for the control of robots.61
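As a concrete illustration of the delayed-feedback memory of Eq. (1.1), the following sketch integrates a delay differential equation with a simple Euler scheme. The Mackey–Glass nonlinearity and all parameter values are our own illustrative assumptions, not taken from the systems discussed in this chapter.

```python
import numpy as np

# Minimal Euler integration of dx/dt = f(x(t), x(t - tau)), Eq. (1.1),
# with the Mackey-Glass form chosen (as an assumption) for f.
def f(x_now, x_delayed, beta=0.2, gamma=0.1, n=10):
    return beta * x_delayed / (1.0 + x_delayed**n) - gamma * x_now

dt, tau, t_end = 0.1, 17.0, 500.0
n_delay = int(tau / dt)            # steps spanned by the delay
steps = int(t_end / dt)

x = np.zeros(steps + n_delay)
x[:n_delay] = 1.2                  # constant history on [-tau, 0]

for k in range(n_delay, steps + n_delay - 1):
    x[k + 1] = x[k] + dt * f(x[k], x[k - n_delay])

# The trajectory now depends on its own past: the delay acts as the
# memory that reservoir computers exploit (see Section 1.3.3).
```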


1.2.1. Wet oscillating systems

The Belousov–Zhabotinsky (BZ) reaction is an oxidative bromination of malonic acid in aqueous acidic solution (1.2):

$$2\mathrm{BrO_3^-}(aq) + 3\mathrm{CH_2(COOH)_2}(aq) + 2\mathrm{H^+}(aq) \rightarrow 2\mathrm{BrCH(COOH)_2}(aq) + 3\mathrm{CO_2}(g) + 4\mathrm{H_2O}(l) \tag{1.2}$$

It is catalyzed by various metal ions or metal complexes. Examples are cerium ions, tris-(1,10-phenanthroline)iron(II) (ferroin) and tris-(2,2′-bipyridyl)ruthenium(II), [Ru(bpy)₃]²⁺. When cerium ions are selected, the BZ reaction produces large periodic transmittance oscillations in the UV. In the presence of ferroin, the color of the solution changes periodically from blue to red and back to blue. Finally, with [Ru(bpy)₃]²⁺, the BZ seesaws from orange to green and back to orange, and it emits periodic luminescent flashes of red light. Furthermore, in the presence of [Ru(bpy)₃]²⁺, the BZ reaction is photosensitive to both UV and blue-green light.62 Recently, the ferroin-based BZ system has been reported to respond to white63 and green61 light.

The Orbán reaction is an oxidative degradation of thiocyanate by hydrogen peroxide in aqueous alkaline solution (1.3):

$$4\mathrm{H_2O_2} + \mathrm{SCN^-} \rightarrow \mathrm{HSO_4^-} + \mathrm{NH_4^+} + \mathrm{HCO_3^-} + \mathrm{H_2O} \tag{1.3}$$

It is catalyzed by copper ions, and when luminol is added, it gives rise to periodic chemiluminescent flashes of blue light. It is also sensitive to blue light.64 The BZ and the Orbán reactions can mimic the dynamics of pacemaker, tonic excitable, and chaotic neurons. Since they originate spikes of transmitted or emitted UV–visible radiation, they encode information through their optical signals.50 They can establish optical communication with luminescent and photochromic compounds. Luminescent compounds respond to an excitatory optical signal by emitting light. On the other hand, photochromic compounds react by changing their color and hence the spectral composition of the light they transmit.49 Both luminescent and photochromic compounds


are good models of neurons in the phasic excitable regime because they respond in an analog manner to external optical stimuli. When the photochromic or the luminescent compounds are in the presence of hydrodynamic convective motion of the solvent, triggered by a suitable thermal gradient, they can originate either periodic or chaotic optical signals.65–67 The optical communication between the oscillatory, chaotic, and excitable artificial neuron models allows one to obtain "in-phase", "out-of-phase", "anti-phase", and "phase-locking" synchronization phenomena analogous to those popping up in real neural networks.50

Photochromism finds further applications in neuromorphic engineering. Photo-reversible photochromic compounds allow the implementation of memory effects: if they are direct photo-reversible photochromes, UV and visible signals promote and inhibit their coloration, respectively. Furthermore, the intrinsic spectral evolution of every photochromic compound that transforms from one form to the other under irradiation generates either positive or negative feedback actions. The optical feedback actions produced by every photochromic compound act both on itself and on other photo-sensitive artificial neuron models that are optically connected to the photochrome. Therefore, it is easy to devise recurrent networks by selecting photochromes with proper spectral properties. The feedback actions of every photochrome are wavelength-dependent because its photo-excitability, which depends on the product EΦ, where E is its absorption coefficient and Φ is its photochemical quantum yield, is also wavelength-dependent. Therefore, photochromic compounds allow the implementation of neuromodulation. Neuromodulation is the alteration of neuronal and synaptic properties in the context of neuronal circuits, allowing anatomically defined circuits to produce multiple outputs by reconfiguring networks into different functional circuits.68

1.2.2. Electrochemical oscillators

Nature, with all its processes, the more elaborate as well as the completely simple ones, is an extremely vast source of human inspiration in the laboratory and in everyday life. Looking for solutions that enable a precise reflection of reality, especially in such a complex aspect


as brain activity and learning, can lead humanity to a brighter future, better medical treatments, or self-learning computers. In this section, electrochemical oscillators will be presented as a contender for further consideration for future use in neuromimetics and information technologies.

The history of electrochemical oscillators dates back to 1828, when Fechner69 observed a new type of dissolution of silver on iron in an acidified solution of silver nitrate: nonlinear and periodic. Since the 19th century, electrochemical oscillators have attracted scientific attention not only as a curious abnormality but as a new type of system which can be the future of computing in materio22 or of mimicking biological systems.70 Figure 1.1 shows oscillations recorded in various neural tissues,71 whereas Figure 1.2(a) shows various electrocorrosive oscillations.72 The similarities between these two oscillatory systems are clearly visible.

After many years of investigation, two common features emerged from the different types of mechanisms behind electrochemical oscillations: (i) a negative charge-transfer resistance (occurring in the potential window where the oxidation rate decreases with more positive oxidation potential, or the reduction rate decreases with more negative reduction potential) and (ii) an external resistive component that weakens the controlling potential (as a result of, e.g., a passive film forming on the electrode). Moreover, if one looks closely at the curves depicting such oscillations, one will find characteristic regions of (1) a rapid increase of the current to a maximum value, (2) a current decrease at a slowing rate, and (3) an acceleration of the current decrease to a minimum value, at which it remains until the whole process resumes. This basic model of electrochemical oscillators helps to distinguish the processes of interest from each other and to focus on the steps underlying their origin.73

Considering electrochemical oscillators as a group of processes presenting periodic and even chaotic phenomena, they can be divided into two main categories: anodic and cathodic. A vast majority of the cases found in the literature are based on the anodic polarization of metals, associated with instabilities during passivation or with the porosity of the oxide film/electrode surface.74 A general conclusion can be made: if a region with non-steady processes can

Figure 1.1. Non-sinusoidal oscillations recorded during various neurophysiological experiments: the mu rhythm (a), beta oscillations (b), theta oscillations (c), slow neocortex oscillations (d), gamma oscillations of pyramidal neurons (e), simulated oscillations in the Morris–Lecar model (f) and alpha oscillations in the rat gustatory cortex (g). Reproduced from Ref. [71] with permission. Copyright Cell Press 2017.


Figure 1.2. Electrochemical oscillations recorded in various corrosion experiments (a) and the oscillatory window determined for corrosion of iron in sulfuric acid solutions (b). Adapted from Ref. [72].

be found in the system, then oscillations within a specific potential range can occur (Figure 1.2(b)). This usually happens within the so-called Flade potential window, that is, a potential region on the boundary of the passive and active zones of metal surfaces exposed to the electrolyte.75

The process of nickel dissolution during galvanostatic electropolishing in sulfuric acid or sodium sulfate in the presence of chloride anions, examined by Doss and Deshmukh76 as well as by Hoar and Mowad,77 can serve as an example of an anodic oscillator. The second group observed a periodically occurring dark-brown layer on the surface of the electrode and proposed a mechanism based on two charge-transfer reactions following one another, (1.4) and (1.5):

$$2\mathrm{Cl^-} \rightarrow \mathrm{Cl_2} + 2e^- \tag{1.4}$$

$$\mathrm{Cl_2} + \mathrm{Ni} \rightarrow \mathrm{Ni^{2+}} + 2\mathrm{Cl^-} \tag{1.5}$$

The first reaction was considered the slower one and was depolarized by the second. Furthermore, oscillations were not recorded until, at a certain value of the current density, the chloride concentration reached the limit that promotes anodic dissolution (this can be called "the active state"). At some stage the transport of Cl⁻ slows down, which promotes a rise of the electrode potential, and


other reactions can take place, (1.6)–(1.8):

$$\mathrm{SO_4^{2-}} \rightarrow \mathrm{SO_4} + 2e^- \tag{1.6}$$

$$\mathrm{SO_4} + \mathrm{H_2O} \rightarrow 2\mathrm{H^+} + \mathrm{SO_4^{2-}} + \mathrm{O} \tag{1.7}$$

$$\mathrm{Ni^{2+}} \rightarrow \mathrm{Ni^{3+}} + e^- \tag{1.8}$$

After immediate hydrolysis of Ni³⁺ ions, a film of Ni(OH)₃/NiOOH forms and blocks the surface of the electrode, which in turn causes an increase of the cell potential. The last reaction (1.9) of this cycle occurs simultaneously and leads to the dissolution of the blocking layer; the cell voltage decreases, the surface becomes more porous, and the whole process starts over:

$$\mathrm{Ni^{3+}} + \mathrm{Cl^-} \rightarrow \mathrm{Ni^{2+}} + \mathrm{Cl} \tag{1.9}$$

This type of oscillator is the most typical and is often called a "corrosion oscillator". Its mechanism is the best known and has great potential for modeling. In the literature there are also systems based on the anodic oxidation of non-metallic compounds such as formaldehyde78 or hydrogen.79 Such "non-metallic anodic oscillators" usually involve much more complex mechanisms with numerous intermediate radical products.

Cathodic processes are also known, but they attract less attention because of their more complex nature and less visible effects. However, some examples can easily be found in the literature.80 An example is the "In/SCN oscillator", first described by De Levie81 and successfully investigated by Koper and Sluyters53,82 and others. Oscillations were observed during the thiocyanate-catalyzed reduction of indium ions on a mercury electrode. This phenomenon is based on two reactions: the first one (1.10) is slow due to the diffusion of In³⁺, and the second one (1.11), responsible for the negative resistance in the system, is faster: an increased desorption of thiocyanate anions and the reduction of indium:

$$\mathrm{In^{3+}} + 2\mathrm{SCN^-} \rightarrow \mathrm{In(SCN)_2^+}\,(ads) \tag{1.10}$$

$$\mathrm{In(SCN)_2^+}\,(ads) + 3e^- \rightarrow \mathrm{In^0} + 2\mathrm{SCN^-} \tag{1.11}$$


The instabilities in the system are observed during slow potential scans. There is an interplay of potential-dependent thiocyanate adsorption on mercury, diffusion of reagents, and thiocyanate-catalyzed In³⁺ reduction at the mercury electrode. With increasing negative polarization of the electrode, after an initial increase of the current intensity, a decrease of the current is observed. Subsequently, at more negative potentials, the current starts to oscillate. This is associated with the electrostatic repulsion of the free thiocyanate anions formed when the surface indium thiocyanate complex is reduced (1.11). This results in an apparent negative charge-transfer resistance of the electrode, which in turn results in current oscillations, as the formal potential imposed on the working electrode is perturbed by the influence of the negative resistance on the potentiostat feedback loop.83

Looking at these examples of electrochemical oscillators, it is possible to create models with "active-passive" states based only on chemical reactions and their visible changes on the electrodes (thin films); however, more accurate models should include diffusion and local concentrations of reagents, and therefore more complex equivalent circuits for electrochemical impedance experiments are required.84 Regardless of whether systems with the anodic or the cathodic type of oscillations are considered, the similarity to biological systems, the possibility to model their behavior, as well as their capacity for time-space coupling determine their potential as candidates for the development of the fields discussed in this chapter. Okamoto et al. reported the analogy between the oscillatory behavior of formic acid oxidation and the human nerve cell,70 which share such characteristic features as a threshold, a refractory period and a stimulation-dependent response; moreover, Suzuki85 proposed a mathematical model of a single neuron which helped understand the propagation of specific waveforms and velocities in biological structures. Researchers are able to go much further than simply mimicking the behavior of natural cells: Liao and coworkers86 discovered a material exhibiting a completely new feature, namely, electrochemical oscillations coupled with memory effects. This may lead to the conclusion that not only may oscillations be the source of neuromimetic effects, but also the reverse


causal relation should be considered. The newest approach to the application of electrochemical oscillators is based on the exploration of coupling phenomena. Coupling can be induced between two or more electrodes connected in series or in arrays, as successfully described by Jia87 and Zhai.88 This kind of conjunction can emerge from different factors such as potential overlap or mass transport: reagents can migrate from one point of the system to another and influence the behavior of the coupled sites.89 Moreover, the coupling may have a spatial90 or spatio-temporal91 character, which indicates that both static and dynamic patterns may result from coupled oscillatory processes. The occurrence of this phenomenon extends the possibility of using oscillators in chaotic,92 synchronizing,93 and decoding57 systems, as well as making them great building blocks of artificial neural networks94 and self-organizing matrices.95 Maybe they are the future of cheap, biomimetic, and easily programmable in materio computers?27

1.3. Computation and Control in Dynamic Systems

1.3.1. Computation in memristive devices and systems

Recent findings in the field of computer science delivered stunning proof of the capabilities of AI. However, the rapid development of AI is becoming more and more limited by the computational speed and power consumption of modern computers. As a result, training a large neural network can emit carbon dioxide in amounts five times larger than that emitted by a car during its lifetime.96 These reasons led to a growth of interest in alternative, unconventional computing concepts involving the mimicking of the learning processes in the human brain. Despite operating at much lower frequencies than CPUs, our brains can process massive amounts of information in a parallel fashion, requiring only a fraction of the power consumed by computers performing similar tasks. It is estimated that performing complicated tasks, such as playing the game of Go, required AlphaGo (equipped with 1202 CPUs and 176 GPUs) to consume even 50,000 times


more power than a human brain performing the same task.97 These powerful processing abilities attracted attention to the design of devices and architectures that process information in a way inspired by neurons, but which could still be integrated with classical modern electronics. Nowadays, efforts to make machines compute in a more human manner have become a thoroughly investigated field of science, stretching from software implementations of neural networks and deep learning, through the material design of neuromorphic devices, to the integration of physical artificial synapses on a chip.

The first ideas involving the concepts of in materio computing or compute-by-physics date back to the 1950s. The idea of utilizing Kirchhoff's and Ohm's laws has been applied to solve partial differential equations, for image filtering, motion computing, and neural network algorithms.98–101 Unfortunately, the performance of these systems was overshadowed by the rapid development of fast CMOS architectures, and the compute-by-physics architectures lost their attractiveness. Around the same time, the concept of neuromorphic computing was conceived. Although it dates back to the 1950s, the first practical applications employing neural networks appeared only in the late 1980s and originally involved mimicking the behavior of biological neural networks with analog electronics.102 In recent years, this term has been stretched to many different implementations, including analog, digital, hardware, and software models of neural networks. The development of novel non-volatile memories and memristive devices, however, brought growing interest in hardware neuromorphic circuits. Hardware realizations of neural networks with resistive switching devices, transistors, and memristors have proven to be promising alternatives to classical computing circuits, bringing together the concepts of in materio and neuromorphic computing.

1.3.1.1. Logic design with memristors/memristive devices

One of the very first implementations of memristive devices for computing was their integration with standard CMOS logic. Due to their variable (e.g., voltage- or charge-dependent) conductance,


memristors can act as controllable switches or latches that can be opened (high conductance state) or closed (low conductance state) with a voltage of proper polarity.103 To be considered for logic design, simple circuits should be able to realize material implication (IMP or IMPLY), a fundamental logic operation that, together with the FALSE operation (a function that always yields 0 at the output), forms a computationally complete logic basis for computing any Boolean function. Another great advantage of memristive devices in logic design is that they can both perform logic operations and store logical values as their conductances, making the logic stateful. It has been shown that simple circuits consisting of resistors and two memristors can perform implication logic, while the NAND operation requires three memristors.103 A small number of memristors arranged in arrays has proven to be sufficient for the effective computation of all Boolean functions.104,105 However, the limitation of IMPLY logic is that it requires additional circuit components like resistors and controllers, consumes a lot of power, and demands high circuit complexity. A much simpler design can be obtained with memristor-only logic (Memristor-Aided LoGIC, MAGIC).106 MAGIC takes advantage of the resistance switching process in memristors: depending on the direction of the current flowing through the device, the conductance can be either increased or decreased. As a consequence, AND and OR gates can be easily implemented with two memristive devices, and the final functionality depends on the orientation of the memristors within the gate. NOT, NOR, and NAND gates can be built with three memristors, the last two of which are logically complete (the corresponding IMPLY logic gates require an additional FALSE operation).105,106 For practical implementations, it is crucial to decrease the memristive logic circuit footprint, for example, by stacking. Recent work shows that the design of 3D crossbar arrays is possible with logic primitives consisting of two antiparallel bipolar memristors. Furthermore, the robustness of this design allows for the implementation of any bipolar resistive switching device, paving the way for architectures with logic operations performed directly within the memory.107
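To illustrate how statefulness makes IMPLY computationally complete, the following behavioral sketch (our own illustration, not a circuit-level simulation) composes NAND from a FALSE reset and two IMPLY steps, with memristor states standing in for logical values:

```python
# Behavioral model of stateful IMPLY logic: each step overwrites the
# target "memristor" q with (NOT p) OR q; FALSE resets a device to 0.

def IMPLY(p: int, q: int) -> int:
    # Material implication; in hardware the result is stored back into q.
    return int((not p) or q)

def FALSE() -> int:
    return 0

def NAND(a: int, b: int) -> int:
    s = FALSE()        # working memristor reset to logic 0
    s = IMPLY(a, s)    # s = NOT a
    s = IMPLY(b, s)    # s = (NOT b) OR (NOT a) = NOT (a AND b)
    return s

for a in (0, 1):
    for b in (0, 1):
        print(a, b, NAND(a, b))   # reproduces the NAND truth table
```

Since NAND alone is functionally complete, this three-memristor sequence suffices, in principle, to compute any Boolean function, as stated above.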


1.3.1.2. Matrix vector multiplication

The need to manipulate massive data arrays is crucial for many applications, particularly linear transforms (the discrete Fourier transform, the discrete cosine transform) and artificial neural networks (ANNs). However, due to the von Neumann bottleneck, the computation and transfer costs are significant and limit the prospects of applying ANN algorithms on a bigger scale. To train a deep neural network (DNN), millions of synaptic weights need to be iteratively updated, demanding the constant transfer of huge data structures between the CPU and RAM. This leads to high energy consumption, of the order of kilowatts, and days or even weeks of computational time. In the conventional von Neumann architecture, such computations usually require many multiply-accumulate (MAC) operations and constant data transfer between the CPU and working memory. Despite the development of hardware dedicated to accelerating these operations, such as tensor processing units, the manipulation of massive datasets is still considered a time- and energy-consuming computational step. In fact, the speed limitation of constant data moving and fetching is considered one of the most important bottlenecks for the further development of AI.

The matrix vector multiplication (MVM) is a fundamental mathematical operation for many applications, for example, the training and inference of ANNs. In conventional computers, the physical separation of the processing unit and memory causes constant data shuttling, limiting the speed and causing significant energy consumption. A potential solution to this problem is based on harvesting the inherent non-volatile memory of memristive devices connected together in a crossbar array. The design of such an array involves many memristors placed at the intersections of perpendicular row and column electrodes. In such arrays, computing and memory can be intertwined on a single chip, with no need to move data between other components. More interestingly, arrays of memristive devices are capable of carrying out MVM in just a single step.108 The ability to compute MVM is a consequence of Ohm's law and Kirchhoff's law. The output current of any element in the matrix is


given by (1.12):

$$I = G \cdot U \tag{1.12}$$

where U is the applied voltage, I is the current read at the output of the device and G is the conductance of an element of the memristor array. As the voltage $U_i$ is applied to the ith row, the resulting total current in the jth column ($I_j$) can be expressed as (1.13):

$$I_j = \sum_i G_{ij} U_i \tag{1.13}$$

where $G_{ij}$ is the conductance of the memristive element located in the ith row and jth column. To address a particular cell $M_{ij}$ characterized by the conductance $G_{ij}$, one simply needs to bias row i and column j with a certain voltage. As a result, the measured output current I = (I₁, I₂, . . . , Iₙ) is an analog product of the conductance matrix G and the input voltage vector U = (U₁, U₂, . . . , Uₙ), that is (1.14):

$$I = G \times U \tag{1.14}$$

By incorporating a transimpedance amplifier into the circuit, one can invert the MVM operation (1.14) to obtain the division (1.15):108

$$U = G^{-1} \times I \tag{1.15}$$

Recently, it has been shown that cross-point resistive memory arrays are capable of performing linear and logistic regression in just one step.109 To find the vector minimizing the error in the regression analysis, one has to compute the Moore–Penrose inverse given by (1.16):

$$w = X^{+} y = (X^T X)^{-1} X^T y, \tag{1.16}$$

where w is the solution of the equation Xw = y. The equation can be solved by mapping the matrix X onto the array of conductances, with the vectors w and y corresponding to voltage and current, respectively. This system is capable of performing logistic regression as well, proving its usefulness in classification tasks.109
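As a numerical illustration of Eqs. (1.12)–(1.16), the following sketch emulates the algebra that a crossbar performs physically; all sizes and values are arbitrary assumptions, and numpy stands in for the analog hardware:

```python
import numpy as np

rng = np.random.default_rng(0)

# Crossbar MVM, Eqs. (1.12)-(1.14): conductances G[i, j] at the row-column
# intersections, voltages U on the rows; Kirchhoff's law sums the
# per-device Ohm's-law currents in each column.
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # conductances (siemens)
U = np.array([0.10, 0.20, 0.05, 0.15])     # row voltages (volts)
I = G.T @ U                                 # column currents, in one "step"

# One-step regression via the Moore-Penrose inverse, Eq. (1.16); on the
# hardware of Ref. [109] the array itself performs this algebra.
X = rng.normal(size=(10, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.linalg.pinv(X) @ y                   # w = (X^T X)^{-1} X^T y
print(I, w)
```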


1.3.1.3. Hardware artificial neural networks

The last decade evidenced the power of AI, with deep neural networks matching and even outperforming human capabilities in tasks previously regarded as extremely difficult for computers, such as speech, object, and face recognition.110 However, the current DNNs' demand for power will shortly become a bottleneck in the entire field of AI. To design an energy-efficient neuromorphic circuit crafted to deal with massive datasets and multiple MVM operations, it is necessary to mimic the processing of information by biological neural networks in a much more accurate way.111

The working principle of ANNs is loosely based on the McCulloch–Pitts mathematical model of the neuron.112 In a classical feed-forward, fully connected ANN, a neuron is considered a node that receives, processes and transmits signals to other neurons. Each of the connections (an analogy of biological synapses) is characterized by a weight indicating its strength. The neurons are grouped in layers, each connected with the neighboring ones. The learning process relies on the adjustment of the connection strengths between the neurons. Simple ANNs consist of only one layer between the input and output layers, called a hidden layer. Neural networks with more than one hidden layer are regarded as deep neural networks. The output signal of a neuron is therefore a weighted sum of the outputs from the preceding neurons (1.17):

$$y_j = \sum_i w_{ij} x_i, \tag{1.17}$$

where $y_j$ is the output signal of the jth neuron that received input signals $x_i$ from the ith neurons of the preceding layer, connected with the jth neuron by the corresponding synaptic weights $w_{ij}$. The output is then processed through a nonlinear activation function, for example, tanh, sigmoid, or ReLU.113 In the most frequently used supervised training algorithm, backpropagation, the signal from the last layer (usually called the output layer) is compared with the correct values. The calculated error is then backpropagated to the first (input) layer of the network, and the synaptic weights are


iteratively updated over many iterations (epochs) until the error is minimized. Finally, the prediction accuracy of the ANN is tested on a new (test) dataset, serving as a benchmark.

Thanks to the presence of many neuromorphic effects, memristor-based artificial synapses are likely to play a crucial role in the design of neuromorphic circuits. The properties that distinguish memristive devices as building blocks for bio-plausible ANNs are the presence of more than two conductive states, low programming energy, and the ability to store the programmed conductance. Due to the numerous effects responsible for conductance switching, many different devices are considered for implementation in neuromorphic hardware, namely, resistive switching random access memory (RRAM), phase change memory (PCM), and ferroelectric and spin-transfer torque magnetic random access memories (FeRAM and STT-MRAM, respectively). RRAM and other resistive switching devices are able to store up to 6.5 bits of information, and PCM devices even up to 8 bits.114–116 The energy required to switch between distinct conductive states is as low as a few femtojoules, nearly equal to that consumed by neurons, and the data retention is estimated to be of the order of years for some state-of-the-art devices.113,117

Arrays consisting of 32 × 32, 128 × 8, and 128 × 64 memristive one transistor–one resistor (1T1R) devices have proven to be effective in facial recognition, sparse encoding, and handwritten digit recognition.101,118,119 The latter was able to achieve an overall recognition accuracy of 89.9% on the MNIST dataset, coming close to the software baseline. Convolutional neural networks (CNNs) deal with image recognition and object detection much better than classical ANNs and can be used to further improve the accuracy. A fully hardware implementation of a memristive CNN consisting of eight 2048-cell memristive arrays arranged in five layers achieved 96.9% accuracy.120 Larger DNNs have been built with the use of PCM devices. Arrays consisting of 165,000 synapses achieved 82.2% on the MNIST dataset, and an architecture with 1 million devices achieved 93.1% on the CIFAR-10 dataset and 71.6% on the ImageNet benchmark.121,122 The accuracy is still lower than that achieved by software solutions, but


these results prove that hardware DNNs are capable of dealing with complex visual patterns.

The biggest promise of hardware neuromorphic circuits is to significantly decrease energy expenditure. Memristive cross-point arrays benefit from integrating the computing unit and memory in one physical device, which significantly speeds up the operation of the network. It is estimated that memristive neuromorphic architectures are able to operate at 115 TOPS W⁻¹ (tera operations per second per watt) while performing MVM.101 In comparison, digital CMOS-based technology doing the same task at lower accuracy is estimated to operate at 7 TOPS W⁻¹ (see Ref. [119]). It is presumed that with a proper architecture, the energy consumption can be even 1000 times lower than that required by classical circuits.118

However, the implementation of ANNs poses a significant challenge in the design of hardware neuromorphic circuits. In general, a synaptic weight can have both positive and negative values, which requires a modification of the simple crossbar architecture.123 To obtain negative values, one has to calculate a relative synaptic weight as a difference between a pair of conductances (1.18):

$$w_{ij} = G_{ij} - G_{ref}, \tag{1.18}$$

where $G_{ref}$ is some fixed reference conductance.124 However, this architecture is viable only for devices exhibiting bipolar switching behavior (i.e., when the conductance can be both increased and decreased). In the case of unipolar switching devices, like PCM devices, in which the conductance can be tuned only in one direction, the calculation of the synaptic weight requires a second tunable conductance to yield both positive and negative values, that is (1.19):

$$w_{ij} = G^{+}_{ij} - G^{-}_{ij}. \tag{1.19}$$
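A small sketch of this weight-to-conductance mapping may be helpful; the conductance window and scaling factor below are our own illustrative assumptions, not device parameters from the cited works:

```python
import numpy as np

G_MIN, G_MAX = 1e-6, 1e-4   # assumed programmable conductance window (S)
G_SCALE = 1e-5              # assumed weight-to-conductance scaling

def to_differential(W):
    # Eq. (1.19): w = G+ - G-, with both conductances kept non-negative
    # and clipped into the programmable window of the device.
    G_pos = np.clip(np.where(W > 0,  W * G_SCALE, 0) + G_MIN, G_MIN, G_MAX)
    G_neg = np.clip(np.where(W < 0, -W * G_SCALE, 0) + G_MIN, G_MIN, G_MAX)
    return G_pos, G_neg

W = np.array([[0.5, -1.2], [-0.3, 0.8]])
G_pos, G_neg = to_differential(W)
print(np.allclose(W, (G_pos - G_neg) / G_SCALE))   # True: weights recovered
```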

Other significant issues hampering learning effectiveness are device-to-device variations, the nonlinearity of resistive switching, and the lack of symmetry of switching between high and low conductances, all leading to lower prediction accuracy.125–127 It has been shown that even a few percent of linearity discrepancy between devices can lead


to a decrease in the inference accuracy.128 These issues can be overcome with feedback programming algorithms that bring devices into the desired conductance state and by selecting a narrower conductance window.101 Programming techniques that take into account the variability of resistive switching among the devices in the array have also proven to effectively increase the inference accuracy.125,129

Spiking neural networks (SNNs) can emulate the behavior of biological neurons in a much more reliable way than classical ANNs. In the brain, both the time of neuron firing and its position in the network carry information.130 The information is encoded with binary spikes, and only specific spiking patterns induce neurons to emit the action potential. Thus, human brains are extremely efficient in processing complex, spatio-temporal data like speech and vision. A similar rate-encoding approach is implemented in SNNs, where information is coded with the pulse width, the frequency, and the relative time of its occurrence.111 CMOS-based SNN neuromorphic technologies circumventing the von Neumann architecture limitations are already available, including the SpiNNaker supercomputer, Intel Loihi, and IBM TrueNorth neuromorphic chips.131–133 While being state-of-the-art neuromorphic circuits, these solutions employ complex circuits of classical electronic elements to emulate synaptic behavior. Due to the innate presence of synaptic effects, memristive devices could significantly simplify the design of hardware SNNs. The presence of numerous neuromimetic effects, including Hebbian learning rules, spike timing-dependent plasticity (STDP), spike rate-dependent plasticity, paired-pulse facilitation, metaplasticity, and associative and non-associative learning, allows the direct and efficient implementation of spatio-temporal learning rules.134–138 Additionally, SNNs are able to carry out unsupervised training, where the network autonomously learns a pattern based on the data submitted to the input.111,139 An SNN consisting of only 16 synapses, trained with unsupervised weight updates via STDP, showed the ability to learn static and track dynamic patterns.140 It is noteworthy that in SNNs, only devices that receive a spike become active, while


the rest remain idle. This is a highly advantageous feature, as it will allow an even further reduction of energy consumption.141

1.3.2. Principles of control in dynamic systems — PID case

Control theory deals with the analysis of various dynamic systems and with methods of constructing controllers. The most common control systems are based on feedback loops, where the controlled signal is compared with a given reference signal, after which the difference between them (i.e., the error) is used to calculate the corrective control action.142–144 Currently, one of the most widely used mechanisms of control is the proportional-integral-derivative (PID) controller.142 The first theoretical work creating the mathematical foundations and describing the operation of the PID controller appeared in 1922.145 The purpose of Nicolas Minorsky's work was to determine the conditions of stability; in his work he used his sailing intuition, which was based on the fact that the helmsman controlled the ship not only based on the current error, but also took into account errors that occurred in the past and the current rate of change. His concepts were implemented on the battleship USS New Mexico to control the angular velocity. The use of the "PI" controller (using only the proportional and integral components) enabled the angular error to be reduced to ±2°, while the addition of the "D" component helped reduce the error to ±1/6°, which was much better than any helmsman could achieve.146 Since then, the academic community has become deeply interested in the subject of PID controllers, as evidenced by international conferences and constantly developed models and emerging patents.147–149

The PID control device operates continuously through the use of the feedback loop and the appropriate corrective actions of its three calibrated components. The equation written in parallel form, specifying the output of the classic PID controller operating in a continuous manner, is as follows (1.20):

$$u_{PID}(t) = K_P\, e(t) + K_I \int_0^t e(t)\, dt + K_D\, \frac{de(t)}{dt}, \tag{1.20}$$


where $u_{PID}$ is the output of the PID controller, $K_P$, $K_I$, and $K_D$ denote the coefficients of the proportional, integral, and derivative gains, respectively, and e(t) is the error. The standard form of Equation (1.20) is often presented as (1.21):

$$u_{PID}(t) = K_P \left( e(t) + \frac{1}{T_I} \int_0^t e(t)\, dt + T_D\, \frac{de(t)}{dt} \right), \tag{1.21}$$

where the physical meanings of $T_I$ and $T_D$ are interpreted as the time constants of the integration and derivation processes. The block diagram of the PID controller is depicted in Figure 1.3.

The principle of operation of the PID controller is as follows. The "P" component responds to the error e(t) with the proportional gain $K_P$. The use of a proportional component alone is not sufficient to achieve the desired system variable, due to the fact that when the error approaches zero, the correction applied also approaches zero. The integrating element attempts to counteract this by effectively accumulating the error result of the "P" component in order to increase the correction factor. However, instead of stopping the correction after reaching the goal, "I" attempts to reset the cumulative error to zero, which causes overshoot. The "D" component aims to minimize this overshoot by slowing down the correction factor applied as the desired system value is approached.

Figure 1.3. Schematic representation of the PID controller. The symbols represent: r(t) is the desired process value, e(t) is the error calculated from the difference of r(t) and y(t), the measured process value. The contributions of the P (proportional), I (integral), and D (derivative) components are summed and applied to the given process as the control signal u(t). The process itself transforms the control signal into a feedback y(t). The Ξ operator can be regarded as a response function of the controlled process.
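A minimal discrete-time sketch of the parallel-form PID law of Eq. (1.20) is given below; the gains and the toy first-order process are illustrative assumptions chosen only to show the three components at work:

```python
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                  # "I": accumulated error
        derivative = (error - self.prev_error) / self.dt  # "D": rate of change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy plant: first-order lag dy/dt = (u - y)/T, integrated with Euler steps.
dt = 0.01
pid, y = PID(kp=2.0, ki=1.0, kd=0.1, dt=dt), 0.0
for _ in range(1000):
    u = pid.step(setpoint=1.0, measurement=y)
    y += dt * (u - y) / 0.5
print(round(y, 3))   # the output settles near the setpoint 1.0
```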


It is estimated that about 90% of industrial processes are controlled by PID systems.150 They are used to regulate flow, temperature, pressure, humidity, level, velocity, and other parameters of industrial importance. The popularity of this type of control is mainly due to its simplicity, applicability, and the intuitiveness of the impact of the individual components on the system dynamics.151 Many approaches towards the tuning of PID controllers are used to obtain fast and acceptable performance of a given process, for example, manual tuning, different heuristic methods, model-based methods, or even auto-tuning provided by the supplier. Classic heuristic methods of tuning PID systems include the methods by Cohen–Coon,152 Ziegler–Nichols,153 Tyreus–Luyben,154 Åström–Hägglund,155 or the internal model control (IMC-PID) tuning rules by Rivera et al.156 (a minimal sketch of the Ziegler–Nichols rules follows at the end of this section). When the dynamics of a given system is complex, nonlinear, nonstationary, difficult to accurately characterize, or subject to environmental uncertainty, there may be a need for a more advanced control system. Intensive research is conducted on controllers using, for example, fractional calculus,148,157 fuzzy logic,149,158 artificial neural networks,159,160 or hybrid approaches.161–163 For a broader description of the methods of tuning PID control systems, the interested reader is referred to topical reviews.147,164–169

More and more advanced methods of PID control are still being developed, aimed at better control over the time delay of the system response and at avoiding overshoot or oscillations. At the same time, industry-related data present sobering results about operating PI and PID control systems. Industrial plant inspections show estimates that around 80% of the PID controllers in use are poorly tuned. It was found that 30% of PID controllers are in manual mode and that 25% of all installed PID controllers use the factory default settings, which means that they have not been tuned at all.142,170,171 This does not change the fact that scientists are trying to develop and incorporate PID technology into larger control systems as well as to develop related technologies that are based on similar concepts of feedback control, which is the topic of the next section.
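The sketch below illustrates the classic closed-loop Ziegler–Nichols rules in their textbook form: once the ultimate gain K_u and the ultimate oscillation period T_u of the process have been found experimentally, the PID settings follow from fixed ratios (the numerical example is arbitrary):

```python
# Textbook Ziegler-Nichols closed-loop tuning for a full PID controller.
def ziegler_nichols(K_u: float, T_u: float):
    K_p = 0.6 * K_u
    T_i = 0.5 * T_u           # integral time constant, Eq. (1.21)
    T_d = 0.125 * T_u         # derivative time constant, Eq. (1.21)
    # Conversion to the parallel form of Eq. (1.20):
    K_i = K_p / T_i
    K_d = K_p * T_d
    return K_p, K_i, K_d

print(ziegler_nichols(K_u=4.0, T_u=2.0))   # (2.4, 2.4, 0.6)
```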


1.3.3. Reservoir computing

Just like the Internet, smartphones, 3D printing, e-cigarettes, or electric cars, to name a few, the development of artificial intelligence is becoming an integral part of human life. We use social media, online shopping, various video-sharing platforms, communicators, etc. Each of these areas can be subjected to analysis and profiling to generate value through more personalized ads, news feeds, or videos which may be of interest to us.172 This is intended to draw more human attention by suggesting personalized content, including aesthetic, intellectual, or political preferences. The collection and analysis of data translates into a real effect in the world in the form of monetary gains (e.g., AI-supported product/stock market analysis173,174) or politics (the case of Cambridge Analytica affecting the presidential election in the USA175).

The analysis and processing of the data listed above (in addition to classical statistical methods) is carried out using concepts belonging to the field of machine learning (ML). One of ML's intensively developed branches are various ANNs.176 In their functionality and/or structure, they are modeled on biological nervous systems. This is due to the brain's extremely optimized ability to recognize patterns and classify them, as well as its ability to learn. For this reason, ANNs are widely used for various tasks such as classification, prediction, generation, or filtering of data. They turn out to be valuable tools if we are interested in the modeling of various nonlinear processes treated as a "black box". Generally, neural networks can be divided into two subclasses depending on the direction of the data flow: feedforward neural networks (FNNs) and recurrent neural networks (RNNs). Due to the characteristics of these networks, FNNs are more suitable for processing information that is static, that is, not temporally dependent, whereas RNNs turn out to be more suitable for temporal data, where we are interested in modeling the dynamics of a given system.177–179 Control systems are time-dependent entities, so in this chapter we will focus only on the description of new solutions from the RNN class. The flexibility and complexity of RNNs surpass the performance of classical PIDs, which are, by definition, limited



Figure 1.4. Schematic diagram of a classical reservoir computing system. The connections between the input layer and the reservoir layer, as well as the connections between hidden nodes in the reservoir, are all fixed (black). The only trained connections are found between the reservoir layer and the output layer (red). The output layer acts as a decoder of particular reservoir parameters at a given time. Based on the obtained parameters, classification of the input signals can be performed.

to a single feedback loop with no delay and a relatively simple nonlinear core.

One of the RNN approaches is particularly advantageous from the point of view of physical systems: reservoir computing (RC). In RC, a set of hidden neurons called a "reservoir" is intended for mapping the input into higher dimensions through nonlinear transformations, so that it can be classified with linear transformations at the output layer (Figure 1.4). In the general case, the response of a reservoir to the external input vector u can be described by a recursive equation (1.22):

$$x(t) = F\left(W x(t-1),\; W^{in} u(t),\; \sum_{\delta=0}^{H} W^{fb}\, y(t-\delta)\right), \tag{1.22}$$

where x(t) is the vector describing the internal state of the reservoir at discrete time t, W is the matrix of internal weights of the reservoir, $W^{in}$ is the matrix of input weights, $W^{fb}$ is the matrix of feedback weights, y is the output signal used in the feedback, H defines the depth of memory, and F is the activation function of the reservoir. The state of the reservoir is a vector defined by the states of all of its


nodes, N (1.23):

$$x(t) = (x_1(t), x_2(t), \ldots, x_N(t)), \quad N \in \mathbb{N}. \tag{1.23}$$

The readout layer may use only a subvector of the internal state of the reservoir, that is (1.24):

$$y(t) = (x_1(t), x_2(t), \ldots, x_j(t)), \quad j \leq N. \tag{1.24}$$
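The sketch below makes Eqs. (1.22)–(1.24) concrete with a minimal software echo state network; for brevity the feedback term is omitted (H = 0), F = tanh, and the network size, spectral radius, task, and ridge regularizer are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 100, 1000                              # reservoir size, stream length

W_in = rng.uniform(-0.5, 0.5, (N, 1))         # input weights, Eq. (1.22)
W = rng.normal(size=(N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))     # spectral radius < 1 (fading memory)

u = rng.uniform(-1, 1, (T, 1))                # random scalar input stream
target = np.roll(u[:, 0], 3)                  # toy task: recall input 3 steps back

x = np.zeros(N)
states = np.zeros((T, N))                     # reservoir states, Eq. (1.23)
for t in range(T):
    x = np.tanh(W @ x + W_in @ u[t])          # Eq. (1.22) with F = tanh, H = 0
    states[t] = x

# Only the linear readout is trained (here by ridge regression); the
# reservoir weights stay fixed, exactly as discussed below.
ridge = 1e-6
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(N), states.T @ target)
print(np.mean((states @ W_out - target) ** 2))   # small training error
```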

No training is needed in the reservoir layer, which is a huge advantage in terms of the speed of operation. The only training involves the readout layer, which may be a simple linear transformation of the output vector (1.25):

$$\varphi(t) = W^{out} \psi(t), \tag{1.25}$$

and the delayed feedback in the more advanced cases (vide infra). This is especially important in the case of in materio implementations of reservoir computing, where no adjustments inside the physical reservoir can be made.180,181 Because of this, the reservoir must be complex enough to perform nonlinear transformations suitable for data classification and modelling. Among all computational paradigms, reservoir computing seems to be best suited for unconventional in materio implementations,15 although the selection of a proper physical platform is not a trivial task.182,183

The responses of the reservoir must possess the properties of generalization and separability: similar inputs are mapped into similar states of the reservoir, whereas differing inputs are mapped into different states.185 These features can be significantly improved by the addition of trainable elements: a delayed feedback and a drive, as recently reported by Athanasiou and Konkoli (Figure 1.5).184 Most importantly, the drive signal may influence the global dynamics of the reservoir in such a way that the reservoir can perform various computational tasks with the input data. This approach enables a decrease of the reservoir complexity, allows at least partial control over the internal reservoir dynamics, and improves the memory features, thus helping to beat the nonlinearity-memory trade-off. Another significant improvement of reservoir performance may be achieved by the careful engineering of nodes — a combination of



Figure 1.5. Schematic diagram of a modified reservoir computing system. The additional feedback and the trainable drive library are marked in red. See Ref. [184] for details.

linear and nonlinear nodes helps to mitigate the memory-nonlinearity trade-off.186 This approach, however, is more suitable for software-based reservoirs, as in the case of in materio implementations it would require extensive control of materials at the nanoscale. The interested reader is referred to the broad literature for a formal description of these concepts.27,184,187–194

1.3.4. Reservoir computing and control systems

One can envision a full hardware integration of the controlled system with the control system, where the processing of the multisensory data is carried out in an online manner. Such solutions are already available on the market (e.g., SPOT from Boston Dynamics with +EDGE GPU), but they are not efficient compared to dedicated ML computing equipment195 and are expensive. For this reason, research is being conducted on effective platforms that process information in a way convenient for ML applications (e.g., memristor crossbars capable of matrix operations101,196). Since RC is one of the efficient ML methods for processing nonlinear dynamic data, it seems natural to try to use this method for system control applications. The relation of RC to control systems can be manifold. RC can be used in a similar way as


the PID control system (vide infra), or in parallel with a PID to enhance its operation, and it can be trained on the basis of a PID; in the inverse situation, the system under control can be used as a reservoir of states to perform computations. These approaches will be briefly illustrated with examples. It is worth mentioning that the applications for system control presented first are based on software implementations of RC; these are followed by some recent hardware RC implementations. The relation between PIDs and RCs can be noticed at various levels. One of the widely used approaches to reservoir computing is the implementation of single-node echo state machines.197, 198 They are based on a single nonlinear node placed in a feedback loop with appropriate delay and gain elements, which provide the fading memory feature. The conceptual scheme of such a system is shown in Figure 1.6 and the block diagram of a practical application in Figure 1.7. The input signal for such a system must be appropriately conditioned (e.g., divided into constant-time chunks, masked with appropriate masks, combined with a proper drive signal, etc.)199 and fed into the feedback loop at appropriate moments.
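A toy software rendering of this time-multiplexing scheme is given below; the mask, gains and chunk length are our illustrative assumptions, not the parameters of Refs. [197–199]. Each input value is held for one chunk, multiplied by a fixed random mask, and repeatedly passed through a single nonlinear node whose previous chunk closes the feedback loop; the samples taken within the chunk act as the virtual nodes of Figure 1.6.

```python
import numpy as np

rng = np.random.default_rng(1)
n_virtual = 20                                 # virtual nodes per input chunk
mask = rng.choice([-1.0, 1.0], n_virtual)      # fixed random input mask
eta, gamma = 0.5, 0.05                         # feedback gain, input scaling (arbitrary)

def process(u):
    """Feed each input value through one nonlinear node in a delay loop.

    Returns a (len(u), n_virtual) matrix of virtual-node states on which
    a linear readout could then be trained, as in Figure 1.6.
    """
    delay_line = np.zeros(n_virtual)           # holds the previous chunk (delay Δτ)
    states = np.empty((len(u), n_virtual))
    for k, value in enumerate(u):
        for i in range(n_virtual):             # one pass over the chunk = one Δτ
            drive = gamma * mask[i] * value + eta * delay_line[i]
            delay_line[i] = np.tanh(drive)     # the single "real" nonlinear node
        states[k] = delay_line
    return states

virtual_states = process(rng.uniform(-1, 1, 100))
print(virtual_states.shape)                    # (100, 20)
```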


Figure 1.6. Single-node implementation of the reservoir computing concept. The output layer follows the evolution of the input signal at various time instances. It is assumed that the node returns to its initial condition after n steps, therefore a single computational run lasts (n + 1) · Δτ. Trained elements (drive and output layer weights) are marked in red.


In this approach, signal chunks of a length not exceeding the given time interval Δτ are fed into the feedback loop and the signal evolution is observed with a time spacing of Δτ. In this way a series of virtual nodes representing the input signal at different stages of evolution is obtained. The simplicity of this approach is a trade-off against performance. The number of virtual nodes and the length of the signal chunk determine the time-scale of the data processing; therefore this approach is mainly used in photonic systems,200–207 although electronic implementations based on memristive elements are also known (Figure 1.7).180, 208–210

So far, hardware implementations of RC have been shown to act as efficient ANN in materio devices, but without showing their possible application to control systems. Based on previous reports on RC software, it can be assumed that future research will include full hardware integration of the ANN control unit and the controlled system. RC's ability to work effectively in continuous online mode is a big advantage here. Some new RC hardware implementations are described below.

Zhu et al.211 presented a hardware reservoir implemented on a single memristor operating in a feedback loop to capture and analyze neuronal spike trains in real time. The perovskite-based Ag/CsPbI3/Ag memristor showed a good representation of the applied neural signal in its dynamics of operation, which was further enhanced by the feedback loop. In addition, the low operating voltage (on the order of 100 mV) and low operating current (∼nA) made this


Figure 1.7. Block diagram of an experimental setup for a memristor-based reservoir computer (a single-node echo state machine). SMU stands for source-measure unit, GAIN for amplifier, and the delay line is labelled Δτ. Adapted from Ref. [28].


system suitable for working with biological signals. Such a system could prove to be a good candidate for control systems due to its low power consumption and the possibility of continuous operation. In an attempt to reduce the required information processing elements, Wlaźlak et al.209 showed the possibility of classifying signals based on their amplitudes, without any necessity for a readout neural network or any transformation whatsoever apart from a simple threshold. An ITO/PbI2/Ag memristive device operating in a feedback loop was used as the computational RC substrate. Furthermore, the device exhibited the spike-timing-dependent plasticity (STDP) and spike-rate-dependent plasticity (SRDP) Hebbian learning rules. Simplifying the required information processing while maintaining certain functionality may also be important from the point of view of control systems. In a different setup, Vandoorne et al. studied RC based on photonic silicon chips. As this system did not show any nonlinear operation by itself, the desired nonlinearity was introduced at the readout layer. The great advantage of the system was its high speed of operation, enabling data processing at the level of 12.5 Gbit per second. More implementations can be found in recent review papers.183, 212

In one study, Schwedersky et al.213 explored the performance of RC in comparison with a PID system for controlling a refrigerant compressor. Since this device shows regions of nonlinear operation, the RC-based control showed a more than two times lower relative error during operation when testing the experimental setup. Zhang et al.214 presented a comparison of RC with other classical ML techniques for transmission fault monitoring of a 3D printer. In a situation where one has access to a small number of resources, low-cost data acquisition and computation are crucial; in that scenario, RC presented outstanding performance in relation to other ML approaches. Wu et al.215 studied a joint system of RC and PID control of a rehabilitation robotic arm, showing better accuracy than a simple PID controller. Sala et al.216 showed that RC can be effectively trained on the basis of the PID responses and in effect was able to correct the offsets of a controlled robotic arm. Nakajima et al.217 showed an interesting approach: to employ a


model of a soft robotic arm (inspired by octopus tentacles) as a reservoir of states. Soft robotics is characterized by the smooth movements observed in nature, which makes it perceived as high-dimensional, elastic and nonlinear. The dynamics of a soft robotic arm have been successfully used for the approximation of three benchmark nonlinear dynamical systems. In a following work,218 the group presented a hardware robotic arm used to perform computations. The interested reader will find more about soft robotics in the following subsections.

The single-node echo state machine implementation of a reservoir computer is structurally related to PID controllers, which however do not have an explicitly defined delay line, and whose nonlinear response is strictly defined. The other difference relates to the feedback loop — in the case of PIDs, the delay/memory function can be embedded in the controlled element (e.g., in the form of its inertia). This analogy can be further explored for a very specific class of input data: signals satisfying the Dirichlet conditions for the expansion of a function into a Fourier series. Such signals should be: (i) periodic functions absolutely integrable over their period, (ii) of bounded variation in any given bounded interval (i.e., with a finite number of local minima and maxima within a period) and (iii) with a finite number of discontinuities in any bounded interval. All functions which obey the Dirichlet criterion can be represented as Fourier series, for example,

f(x) = A_0 + Σ_{n=1}^{N} A_n sin(nx + φ_n).    (1.26)

Furthermore, the operation of the PID controller should be performed with an integration time significantly shorter than the period of the input oscillations. The two operators of PID can then be applied to each of the Fourier components separately (this is possible due to the linearity of the differential and integral operators); acting on sine functions, they yield cosine functions, Eqs. (1.27) and (1.28):

(d/dx) sin x = cos x    (1.27)


and

∫ sin x dx = −cos x.    (1.28)

Thus, the response function of the PID for any Fourier component of the input signal will be given by

f(x) = K_P sin x + (K_D − K_I) cos x,    (1.29)

which simplifies to

f(x) = √(K_P² + (K_D − K_I)²) · sin(x + φ)    (1.30)

with the phase shift equal to

φ = arctan[(K_D − K_I)/K_P]   if K_P ≠ 0;   φ = sgn(K_D − K_I) · π/2   if K_P = 0.    (1.31)

Thus, the nonlinear operator of the PID can be represented by a fractional derivative27, 219–221 (following the sign convention used in memfractive devices222) of the order223

ξ = −2φ/π,    (1.32)

which automatically leads to the conclusion that PID systems, when subjected to periodic signals satisfying the Dirichlet conditions, have power-law memory features.224, 225 Thus, PID systems can be considered as simplified reservoirs with fading memory, provided that the process function Ξ̃ is: (i) a linear operator, or (ii) a function which satisfies the Dirichlet conditions within the image of the input function f(x). Detailed considerations on the performance of such systems require the spectral analysis of composite functions, discussed in detail by Bergner and Muraki.226 The specific operating conditions (a periodic input and a limited choice of process operators Ξ̃) are not usually met during the operation of PID systems. However, this demonstrates the close formal and functional relation between these two systems, especially if Ξ̃ represents a delay line, which can induce additional phase shifts. A detailed analysis of delayed PID systems was presented by Silva et al.227 With appropriate properties of Ξ̃, PIDs can be applied to signal classification according to spectral features, or to signal filtration, which extracts desired signatures from complex input signals.
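The identities (1.29)–(1.31) are easy to verify numerically, as in the sketch below; the gains K_P, K_I and K_D are arbitrary illustrative values.

```python
import numpy as np

KP, KI, KD = 1.2, 0.4, 0.9            # arbitrary illustrative gains
x = np.linspace(0, 4 * np.pi, 4000)

# PID acting on sin x: P term, I term (integral of sin = -cos, dropping the
# integration constant) and D term (derivative of sin = cos), i.e. Eq. (1.29).
response = KP * np.sin(x) + KI * (-np.cos(x)) + KD * np.cos(x)

# Predicted closed form, Eqs. (1.30)-(1.31).
amplitude = np.hypot(KP, KD - KI)
phi = np.arctan((KD - KI) / KP) if KP != 0 else np.sign(KD - KI) * np.pi / 2
predicted = amplitude * np.sin(x + phi)

print(np.max(np.abs(response - predicted)))   # ~1e-15: the identities agree
```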

In summary, control systems can be implemented in a variety of ways, strictly depending on the dynamics of the system under consideration. Both classic control systems such as PID, as well as more advanced ML-based systems, open up many research paths aimed at increasing their accuracy and functionality compliance. It seems that the world is striving for more and more automation, using increasingly advanced and sophisticated systems, which we have tried to briefly outline in this section in regard to the PID and ML concepts.

1.4. Controllers Beyond PID: Fuzzy and Neuromorphic

1.4.1. Fuzzy logic

Human intelligence has the remarkable power of computing with both numbers and words. A good model of the human ability to make rational decisions by computing with words is fuzzy logic, which has been defined as a rigorous logic of vague and approximate reasoning.228 It is based on the theory of fuzzy sets proposed by the engineer Lotfi Zadeh.229 A fuzzy set is different from a classical Boolean set because it breaks the law of the Excluded Middle: an item may belong to a fuzzy set and to its complement at the same time, with the same or different degrees of membership. The degree of membership (μ) of an element in a fuzzy set can be any number between 0 and 1; it follows that fuzzy logic is an infinite-valued logic. It is used to design controllers because it can describe any nonlinear cause-and-effect relationship. For this purpose, it is necessary to build a fuzzy logic system (FLS). The construction of any FLS requires three fundamental steps. First, the granulation of all the variables into fuzzy sets. The number, position, and shape


of the fuzzy sets are context-dependent. Second, the graduation of all the variables: each fuzzy set is labeled by a linguistic variable, often an adjective. Third, the relationships between input and output variables are described through syllogistic statements of the type "If . . . , Then . . .", which are named fuzzy rules. The "If . . ." part is called the antecedent and involves the labels chosen for the input fuzzy sets. The "Then . . ." part is called the consequent and involves the labels chosen for the output fuzzy sets. In the case of multiple input variables, these are connected through the AND, OR and NOT operators.230 At the end of the three-step procedure, an FLS is built; it is a predictive tool or a decision support system for the particular phenomenon it describes. The effectiveness of fuzzy logic in mimicking the human power of computing with words is due to the structural and functional analogies between any FLS and the human nervous system.231 A significant challenge in the field of Chemical Artificial Intelligence is the design of strategies to process fuzzy logic by using molecules, macromolecules, and systems chemistry.232
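The three construction steps (granulation, graduation, and fuzzy rules) can be made concrete with a deliberately small software sketch; the "temperature"/"heater" variables, the membership functions and the rules are toy assumptions of our own, not an FLS from the cited literature.

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function peaking at b (step 1: granulation)."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fls_heater(temperature):
    # Step 2: graduation - the fuzzy sets are labeled "cold" and "hot".
    mu_cold = triangular(temperature, -10, 0, 20)
    mu_hot = triangular(temperature, 10, 30, 40)

    # Step 3: rules. If cold, Then heater high; If hot, Then heater low.
    out = np.linspace(0, 100, 501)          # heater power universe
    high = triangular(out, 50, 100, 150)    # output fuzzy sets
    low = triangular(out, -50, 0, 50)
    aggregated = np.maximum(np.minimum(mu_cold, high),
                            np.minimum(mu_hot, low))

    # Centroid defuzzification turns the fuzzy answer into a crisp one.
    return np.sum(out * aggregated) / np.sum(aggregated)

print(fls_heater(5.0))    # cool day -> high heater power
print(fls_heater(25.0))   # warm day -> low heater power
```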

1.4.2. Processing fuzzy logic by using molecules

The microscopic world is ruled by the laws of quantum mechanics, which have some links with fuzzy logic.233, 234 The elementary unit of quantum information is the qubit. The qubit, |Ψ⟩, is a quantum system that has two accessible states, labelled |0⟩ and |1⟩, and it exists as a superposition of them:

|Ψ⟩ = a|0⟩ + b|1⟩.    (1.33)

In Equation (1.33), a and b are complex numbers that satisfy the normalization condition |a|² + |b|² = 1. Any logic operation on a qubit manipulates both states simultaneously. It determines an evolution of |Ψ⟩ represented by the product of |Ψ⟩ and an orthonormal operator Ô. The new state |Ψ′⟩ is given by

|Ψ′⟩ = Ô|Ψ⟩.    (1.34)


The new state |Ψ′⟩ still satisfies the normalization condition; in other words,

‖Ô|Ψ⟩‖ = ‖|Ψ′⟩‖ = 1.    (1.35)

When a qubit undergoes any kind of measurement represented by a projector p̂, the probability of an outcome is defined as

⟨Ψ| p̂ |Ψ⟩.    (1.36)

The projector

p̂ = Σ_i |i⟩⟨i|    (1.37)

is a linear operator defined over a set of orthonormal vectors |i⟩. Multiplying a projector with a state vector |Ψ⟩ means to project the vector onto the respective vector subspace. The probability value of Equation (1.36) equals the squared length of the state vector |Ψ⟩ after its projection onto the subspace spanned by the vectors |i⟩. Such a value may be interpreted as the degree of membership of |Ψ⟩ in the subspace spanned by |i⟩.234, 235
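Equations (1.33)–(1.37) and their fuzzy reading can be illustrated numerically; the state amplitudes below are an arbitrary choice.

```python
import numpy as np

# A qubit |Psi> = a|0> + b|1> with |a|^2 + |b|^2 = 1, Eq. (1.33).
a, b = np.sqrt(0.3), np.sqrt(0.7) * np.exp(1j * 0.5)   # arbitrary choice
psi = np.array([a, b])

# Projector p = |1><1| onto the subspace spanned by |1>, Eq. (1.37).
one = np.array([0.0, 1.0])
p = np.outer(one, one.conj())

# Probability <Psi| p |Psi>, Eq. (1.36) ...
prob = np.real(psi.conj() @ p @ psi)

# ... equals the squared length of the projected vector, i.e. the
# degree of membership of |Psi> in that subspace.
membership = np.linalg.norm(p @ psi) ** 2
print(prob, membership)   # both 0.7 (= |b|^2)
```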

The measurement determines the decoherence of the qubit.137 The decoherence induces the collapse of any qubit into one of its two accessible states, either |0⟩ or |1⟩, with probabilities |a|² and |b|², respectively. Decoherence is also induced by the deleterious interaction between the qubit and the surrounding environment, which is a heat reservoir. Whenever decoherence is unavoidable, the single microscopic units can be used to process discrete logics, that is, binary or multi-valued crisp logics, depending on the original number of qubits.6, 10 The relation between quantum logic and fuzzy logic can be established also at the topological level. It is possible to represent a real-valued qubit as a circle in a two-dimensional Hilbert space, which is homeomorphic with a unit square specifying two membership functions.236 The intuitionistic analysis of fuzzy and quantum logic operators also points to striking similarities between fuzzy logic and quantum computing.237 The intuitionistic argument should, however, be used consciously, as there are also marked formal and


practical differences between quantum and fuzzy logic. As pointed out by Lehrack et al.: "As main difference between fuzzy and quantum logic we identified the way how conditions are combined by conjunction and disjunction with respect to a given object: combination in quantum logic is performed before and in fuzzy logic after object evaluation takes place".235

Advanced microscopic techniques, reaching atomic resolution, are required to carry out computations with single molecules. Alternatively, large collections of molecules can be used to make computations. However, vast ensembles of molecules (if they are of the order of Avogadro's number) are bulky materials. The inputs and outputs for making computations become macroscopic variables that can assume continuous values. When the function that relates the input and output variables is steep, it is suitable for processing discrete logic. On the other hand, when the function is smooth, it is suitable for implementing an FLS.238–241

Every compound that exists as an ensemble of conformers works as a fuzzy set.242 The types and the relative amounts of the different conformers depend on the physical and chemical contexts. Every compound is like a word of natural language, the meaning of which is context-dependent. Conformational dynamics and heterogeneity enable context-specific functions to emerge in response to changing environmental conditions and allow the same compound to be used in multiple settings. The fuzziness of a macromolecule is usually more pronounced than that of a simpler molecule because it exists in a larger number of conformers. Among proteins, those that are completely or partially disordered are the fuzziest.243 Their remarkable fuzziness makes them multifunctional and suitable to moonlight, that is, to play distinct roles depending on their context.244 When compounds that exist as collections of conformers and that respond to the same type of physical or chemical stimulus are combined, they granulate the variable into a group of molecular fuzzy sets. These work in parallel and make it easy to discriminate different values of the same variable. This strategy is at the core of the sensory subsystems of the human nervous system.231 Its imitation allows the development of artificial sensory systems that are


highly sensitive. For instance, the imitation of the visual sensory system, using a collection of adequately chosen direct thermally reversible photochromic compounds, has allowed the implementation of artificial chemical systems that extend human vision from the visible to the UV. Such systems discriminate frequencies belonging to the UV-A, UV-B, and UV-C regions, respectively.245, 246 Another attractive platform, in terms of both structural stability and flexibility of possible modifications (e.g., through the intercalation of various small molecules), is provided by nucleic acids. These macromolecules provide a sufficient number of states to implement quasi-continuous variables; hence the definition of fuzzy sets, as well as the construction of an FLS, becomes relatively straightforward. These concepts were discussed by Deaton247 and later by Zadegan,248 who implemented the functionalities of Boolean logic gates and designed an FLS based on the use of Förster resonance energy transfer (FRET). At the same time, simpler molecules may also be utilized for the realization of fuzzy logic operations owing to their electrochemical or photochemical properties. One such compound was presented by Karmakar249 — in this instance, the emission profiles of a polypyridyl-imidazole-based ruthenium complex and their dependence on the presence of selected ions (Fe2+, Zn2+, F−) were utilized.

1.4.3. Implementation of fuzzy logic systems in solid-state devices

Although the use of molecular systems in the context of fuzzy logic implementation opens a variety of paths towards future applications — especially in scenarios where some degree of interaction with the environment is required — some major drawbacks are still present. This is mainly due to the necessity of working in solutions, which in turn hinders the concatenation of such devices, makes them incompatible with conventional, silicon-based architectures and, due to specific requirements concerning input/output operations, impedes interfacing in general.

At the same time, a substantial amount of research effort is put into studies on the utilization of solid-state materials in the


construction of information processing devices realizing the in-materio computing concept, based on multi-valued logic systems (including fuzzy logic).5, 194 These studies also aim for the compatibility of the new systems with classic electronics, which is achievable since most of these devices utilize electrical signals as input and/or output. Moreover, this approach gives an opportunity to harness molecules, which may be used in a similar manner to the above-mentioned (e.g., to facilitate interactions with the environment) or in order to modify the properties of a base material. A good example of a solid-state element capable of realizing FLS functionalities is given in the work by Bhattacharjee,251 in which a tantalum oxide-based device exhibiting some memristive properties is discussed as a suitable platform for designing multi-valued logic gates. Another interesting approach was demonstrated by Xu and Yan, who realized sensing at the molecular level with the use of europium-functionalized metal–organic frameworks, the fluorescent response of which was used to define a so-called "Intelligent Molecular Searcher" utilizing some elements of the fuzzy logic formalism in order to detect changes in the concentration of selected ions.252

An interesting extension of the aforementioned devices may be realized with the use of light as one of the inputs. The authors of this chapter presented two hybrid materials, both composed of titanium dioxide modified with either anthraquinone (Figure 1.8)250 or cyanocarbons,5 as suitable platforms for the implementation of the fuzzy logic formalism. In these cases, we used the photocurrent generation patterns of the nanocomposites to assign fuzzy sets and define rule bases. Since the proposed devices may be controlled by both the applied bias and the change of incident light wavelengths, it is possible to concatenate them with existing optoelectronic elements into more sophisticated networks capable of complex information processing tasks.

1.4.4. Neuromorphic devices

A closely related field, which also aims at mimicking some of the fundamental biological structures and processes, focuses on the


Figure 1.8. The process of fuzzification of the input (a, b) and output (c) signals in the experiment involving titanium dioxide modified with anthraquinone. The rule base (d), the original photocurrent generation pattern (e) and the result of defuzzification (f) are also shown. Adapted from Ref. [250].

implementation of nervous system functionalities within software and hardware frameworks. Here, we discuss exclusively hardware realizations of neuromorphic engineering concepts, with a strong emphasis on the in-materio computing approach. This class of solutions gains substantial interest, as it opens ways for the utilization of unconventional information processing devices in scenarios — such as chemical sensing, multi-valued logic implementation, etc. — in which classic electronic elements are unsuitable. Moreover, in-materio computing is, in some applications, more energy-efficient than solutions based on CMOS architectures.27, 253–256 The basic operation of neurons and synapses may be recreated within different systems capable of a dynamical response to an external stimulus and of an evolution of their internal state in time. Selected concepts of neuromorphic engineering have already been realized within the in-materio computing framework with the use of both solution-based and solid-state systems. Ideally, these devices should also be easily concatenated into networks of higher complexity, allowing the mimicking of neural networks with multiple nodes, in order


to implement more sophisticated functionalities. In the former case (the use of molecules in solutions), this task seems to be difficult to achieve, hence this approach is rather underrepresented in the literature. Nonetheless, one may find several realizations of neuromimetic systems falling within this solution-based design — one of the commonly applied approaches assumes the use of chemical reactions which exhibit a tendency to remain out of equilibrium.232 A good platform meeting this requirement is provided by oscillating reactions, such as the Belousov–Zhabotinsky (BZ) reaction or the Briggs–Rauscher (BR) reaction, which may be used for the recreation of spiking patterns observed in biological structures,47, 257 but also for such sophisticated tasks as image and pattern recognition.62 An interesting variation of such systems is achieved when luminescent or photochromic compounds are introduced to the system. Since the BZ reaction is capable of modulating an optical input in the UV–vis region, these periodic changes become a perturbation for other light-absorbing compounds, which synchronize with the BZ oscillations in one of two different ways: in-phase — if the processes involving the additional agent are fast — or out-of-phase — in case its response is slow.50 This, in turn, allows the realization of different dynamics characteristic of a variety of neural structures, including the functionality of so-called chaotic neurons — in this case the transport processes within the solution play an important role48, 49, 66, 67 — the characteristics of which may be utilized in cryptography or random number generation. Finally, as mentioned before, the concatenation of individual cells is not an easy task; however, some internal feedback loops exist within this class of systems, which open the way for the realization of simple neural networks. Changes in the concentrations and ratios of particular constituents of the system, as well as modifications of various parameters describing both the device and the optical input(s), enable the implementation of different circuits — of both unidirectional and recurrent character — and their dynamical reconfiguration.50, 68

Despite the aforementioned possibilities of connecting solution-based systems into more complex architectures and their internal


capabilities to realize information processing in a manner similar to simple neural networks, there are some limitations to this approach. A more promising and versatile platform for the design of hardware neuromimetic systems is provided by solid-state elements. Here, three distinctive pathways may be defined. The first one assumes the use of classic, silicon-based electronics and existing, or slightly modified, elements — this scenario, as discussed in the former sections of this chapter, is not always energy-efficient and lacks flexibility (e.g., in terms of possible interactions with light or chemical entities).195, 258 The second one, quite common nowadays, focuses on the use of memristors and memristive devices194, 259–261 and is usually based on the analysis of electrical inputs/outputs; this approach is discussed in the previous paragraphs. Finally, the third one aims at utilizing the interactions of various materials (including nanocomposites) with small molecules, electrical stimuli and/or light, in an attempt to mimic the dynamics of individual neurons/synapses and of complex neuronal structures.262–265 Here, we want to focus on the use of hybrid materials and their interactions with light — these systems may offer response times close to the ones observed for biological structures and can be fairly easily concatenated, as one of the inputs and the output of such a device are compatible (usually they are of electrical type). Moreover, the interplay between the thermodynamic and kinetic aspects of charge carrier generation under irradiation in numerous nanocomposites provides the required level of complexity for the recreation of the dynamics of biological neurons and synapses (vide infra).5 These features make light-sensitive hybrid materials a perfect platform for the realization of selected concepts falling within the neuromorphic engineering approach.

One good example of a neuromimetic device exhibiting a short-term memory effect is a binary hybrid composed of cadmium sulfide — which is responsible for the photocurrent generation — and multi-walled carbon nanotubes — which provide additional trapping states for electrons from the conduction band of CdS — sandwiched between two ITO@PET electrodes with an addition of ionic liquid.266 Upon sample irradiation with a pulsed input the


device exhibits Hebbian-like plasticity.267 Moreover, the profile of the generated anodic photocurrent spikes can be controlled through even subtle changes in various experimental parameters (electrode potential, light wavelength, etc.), but also through the time characteristics of the optical stimuli (e.g., the length of the pulses, their number and the interval between them). It is noteworthy that, for a certain set of conditions, it is possible to realize within the described system the two fundamental modes of synaptic response — potentiation (an increase in the intensity of subsequent spikes) and depression (a decrease in the subsequent output signals). This feature results directly from the electron trapping/detrapping (within the MWCNTs) dynamics — a hypothesis proven by numerical calculations carried out for an appropriate equivalent circuit. Even more surprisingly, the application of the mathematical formalism typically used for the analysis of biological structures (defined within the SRDP and STDP models) reveals that the characteristic time constants describing the device fit almost perfectly the values observed for neural structures in living organisms.

Even more complex neuromimetic behavior may be implemented with the use of the intrinsic charge carrier trapping/detrapping dynamics of nanocrystalline cadmium sulfide, rich with additional electronic states present in the forbidden band. It was demonstrated that even a very rudimentary photoelectrochemical device, made of an ITO@PET electrode covered with CdS in a simple three-electrode setup with the addition of an optical input, is capable of sophisticated pattern recognition tasks typically realized by software Artificial Neural Networks — in this particular case the recognition of handwritten digits was carried out — with relatively high energy efficiency and satisfactory separability (taking into consideration the simplicity of the used system).268 In this study, the non-trivial interplay between the thermodynamics and kinetics of charge carrier generation and trapping events leads to a short-term memory effect similar to the above-mentioned case, but here a train of light pulses is used as the stimulus encoding the individual rows of pixels constituting a particular digit.


The dynamical response of the device, exhibiting an increasing intensity of the subsequent photocurrent spikes generated within CdS (which results from the different time constants characterizing the processes involving the two types of trapping states and the interfacial electron transfer), may be analyzed through the application of different threshold levels. Since encoded digits vary distinguishably in terms of pixel dispersion, the number of events for which a specific threshold is exceeded fluctuates between the rows characteristically for each digit (Figure 1.9).


Figure 1.9. A handwritten character (a) encoded row by row into a train of light pulses (b). The resulting series of photocurrent spikes with three threshold levels (c). The result of the reconstruction (d). A simplified mechanism responsible for the short-term memory effect exhibited by nanocrystalline CdS (e). Adapted from Ref. [268].


When compared with a simple pixel counting method, one observes a significant increase in the recognition capability related to the use of the proposed system.
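The thresholding step can be emulated schematically as follows; the digits, thresholds and the leaky accumulator standing in for the device's short-term memory are made up, not the measured photocurrents of Ref. [268].

```python
import numpy as np

def row_features(image, thresholds=(0.5, 1.0, 1.5), decay=0.6):
    """Count threshold crossings of a toy 'photocurrent' per image row.

    The leaky accumulator is a crude stand-in for the short-term memory
    of the device: each lit pixel adds a spike whose amplitude grows with
    the recent history of pulses.
    """
    features = []
    for row in image:
        memory, spikes = 0.0, []
        for pixel in row:
            memory = decay * memory + pixel      # facilitation-like buildup
            spikes.append(memory)
        features.extend(sum(s > th for s in spikes) for th in thresholds)
    return np.array(features)

# Two 3x5 "digits" with different pixel dispersion give different features.
digit_a = np.array([[1, 1, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 1, 1]])
digit_b = np.array([[1, 0, 1, 0, 1], [1, 0, 1, 0, 1], [1, 0, 1, 0, 1]])
print(row_features(digit_a))
print(row_features(digit_b))
```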

1.5. Alternative Computing in Autonomous Robotics

The societal demand for autonomous walking robots is growing. For example, in order to prevent secondary disasters during lifesaving, autonomous robots are necessary for search and rescue in harsh situations where humans cannot physically intervene.269 Conventional autonomous walking robots have used systems that model assumed behavior patterns beforehand and choose an action depending on the situation.270 However, it is difficult to adapt to unexpected situations, because the conventional robot must assume an enormous number of behavior patterns in advance, and any action that was not modelled is impossible. Therefore, a new control system that chooses an appropriate action depending on the situation is necessary for an autonomous walking robot to walk in unknown environments.

Recently, natural computing inspired by physical systems in nature and biology271 has attracted attention in the computer hardware research field in terms of efficient solution search for intractable problems. An interesting example is an amoeboid organism; it optimizes its body shape to maximize the intake of bait through trial and error.272 Aono et al. developed an amoeba-based solution search system, utilizing the search ability of the amoeboid organism and demonstrating the computing ability of the system by solving the traveling salesman problem (TSP).273 In this section, we describe our amoeba-inspired autonomous walking robot, which implements the amoeba-inspired electronic solution search system, the "electronic amoeba".274 This robot successively searches for the appropriate footwork, step by step, to traverse uneven ground without any programming or pre-learning of how to walk. Then we describe our approach to obtaining environmental information and taking action utilizing the physicality of the robot. Finally, we mention the amoeba-inspired autonomous control combined with reinforcement learning for achieving both adaptivity and efficient movement.


1.5.1. Amoeba-based solution search system and electronic amoeba

The electronic amoeba implemented for the autonomous robot control is inspired by the amoeba-based solution search system.272 There are four essences of the amoeboid organism in terms of searching: spreading pseudopods to maximize the intake of bait, avoiding harmful light, fluctuations in motion, and volume conservation. By utilizing these elements, the amoeboid organism achieves an efficient solution search for optimization problems. The electronic amoeba implements this essence electronically, using an analog and/or digital electronic system.

Figure 1.10(a) shows a schematic illustration of the amoeba-based solution search system. The amoeboid organism placed on the


Figure 1.10. Schematic illustrations of (a) an amoeba-based solution search system for solving an optimization problem and (b) an electronic amoeba mimicking the essential dynamics of the amoeboid organism.


chip, which has grooves filled with agar, spreads all its pseudopods along the grooves to increase the bait intake. When a groove lane is selectively irradiated with light, the amoeboid organism shrinks the pseudopod along that lane. The state variable Xi is assigned to the ith groove; when the pseudopod extends into groove i, Xi = 1, otherwise Xi = 0. The amoeboid organism searches for the state that maximizes the bait intake by expanding and shrinking the pseudopods in the chip. To map the problem onto the system, we define a bounceback rule, which is a set of light irradiation rules depending on the state variables. The bounceback rule is designed to prohibit the amoeboid organism from taking state variables that violate the constraints of the problem. Deformation of the amoeboid organism and bounceback by light irradiation proceed alternately. After that, the amoeboid organism becomes stable over time. The state variables at this time correspond to the solution; no variable violates the bounceback rule and the constraints are satisfied. The electronic amoeba (Figure 1.10(b)) follows the process of the amoeba-based system; however, it can proceed through the process much faster than the amoeboid organism.

There are several options to map the bounceback rule for the electronic amoeba. A convenient way is to use a micro-controller having many I/O terminals. In many cases an integrated development environment (IDE) is available, and the rule can be directly written using the Boolean operators implemented in the programming language. It is also easy to rewrite the instance. The state variable values of the amoeba core are sampled and processed in a digital manner. A disadvantage is that the synchronous operation under clocking might cancel the unique dynamics of the amoeba core working as an analog electronic circuit.
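On a micro-controller, a bounceback rule is literally such a Boolean expression over the sampled state variables. The fragment below is a generic sketch of one synchronous bounceback cycle for a toy one-hot constraint; the rule and the pin-level details are hypothetical, not the authors' firmware.

```python
# A generic sketch of one digital bounceback cycle (hypothetical rule set,
# not the authors' firmware). State variables X[i] are sampled from the
# amoeba core; the bounceback outputs L[i] inhibit the corresponding unit.

def bounceback(X):
    """Return inhibition signals L for a toy one-hot constraint:
    at most one of the three units may be active at a time."""
    L = [False] * 3
    for i in range(3):
        others_active = any(X[j] for j in range(3) if j != i)
        L[i] = X[i] and others_active    # two units on together: bounce both back
    return L

# One synchronous update as it would run in the micro-controller loop:
X = [True, True, False]      # sampled amoeba-core state (illustrative)
print(bounceback(X))         # [True, True, False]: the conflict is inhibited
```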



Figure 1.11. Electronic amoeba with a crossbar circuit for solving the Max-Cut and TSP problems.

Another option is a full analog circuit implementation, including the bounceback rule unit. Figure 1.11 shows a system with a crossbar implementing the bounceback rule for maximum cut (Max-Cut) and TSP.275 In these problems, an instance of the problem is represented by a weighted graph. The graph is physically implemented using a fully connected crossbar with resistors at the appropriate cross points. The memristor crossbar is expected to allow reconfiguration of the graph, that is, rewriting of the instance. Analog circuit operation of the whole system can fully utilize the dynamics of the electronic circuits, which should achieve an efficient solution search with less power consumption.

1.5.2. Amoeba-inspired autonomously walking robot

The amoeba-inspired autonomous walking robot that we have developed consists of a commercially available four-legged robot and an electronic amoeba, which is implemented in a microcomputer with the amoeba-inspired solution search algorithm. Figure 1.12 shows a photograph of the robot. This robot has only a touch sensor on each toe, and the sensor detects the ground contact of the leg. The electronic amoeba searches the leg bending motion as a variable


Figure 1.12. Amoeba-inspired autonomous walking robot: (a) photograph and (b) leg bending motions used in robot walking.

of the optimization problem. There are six possible states for each leg, and in total 64 combinations are available as footwork for the robot. The electronic amoeba searches for the appropriate footwork at each walking step depending on the conditions of the ground and the robot. To find optimal footwork for walking, it is necessary to formulate the constraints in terms of walking. We designed the bounceback rules so that the robot avoids the leg motions that lead to falling or retreating, based on information about the leg bending and the ground contact. We prepared 28 bounceback rules; for example, one rule prohibits an unstable posture that retracts two or more feet simultaneously. When the electronic amoeba finds a stable state that satisfies all the rules, the robot stops walking. Therefore, to keep the robot walking, we added another mechanism to the algorithm that randomly cancels the bounceback with probability W = {wi}, wi ∈ [0, 1], at each step, unless the robot fell or retreated. We selected eight rules as a subset R from the prepared 16 bounceback rules, and the ith rule, Ri, was weighted by wi (Table 1.1).
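The weighted random-cancellation mechanism can be summarized in a short schematic sketch; the two rules and their weights below are placeholders, not the actual rule set of Table 1.1.

```python
import random

random.seed(0)

# Placeholder bounceback rules: each maps the leg state to True when the
# corresponding motion must be prohibited (stand-ins, not Table 1.1).
rules = [
    lambda legs: sum(not on_ground for on_ground in legs) >= 2,  # two feet lifted
    lambda legs: not legs[0] and not legs[3],                    # diagonal pair lifted
]
weights = [0.2, 0.5]       # w_i: probability of cancelling rule i at this step

def prohibited(legs):
    """Evaluate the weighted bounceback at one walking step."""
    active = []
    for rule, w in zip(rules, weights):
        if rule(legs) and random.random() > w:   # rule survives the random cancel
            active.append(rule)
    return active

legs = [True, False, True, False]   # ground-contact sensors of the four legs
print(len(prohibited(legs)), "rule(s) currently bounce the state back")
```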



Table 1.1. List of bounceback rules for autonomous walking.

1. When the robot starts, then reset all bounceback rules.
2. If Leg(i) in direction (j) is up at time (t), prohibit Leg(i) from down at the same time.
3. If Leg(i) in direction (j) is down at time (t), prohibit Leg(i) from up at the same time.
4. If number of legs with sensor on

I2, False if I1 < I2) (reproduced with permission from Ref. [19]) and (b) interconnection of chemical ON–OFF switches, in this example the firing of the j switch (Aj True) inhibits the firing of the subsequent i switch (reproduced with permission from Ref. [20]).

In the 1990s, research focused on realizing experimentally chemical diodes21 and single Boolean gates (e.g., AND,21, 22 OR,21, 22 XOR23 gates). In the last decades, emphasis has been placed on connecting these simple gates to attempt more complex information processing, such as pulse counters,24 3-valued logical gates,25 arithmetic circuits26 and gate arrays.27 Different compartmentalization strategies have been used to connect these chemical gates by pure diffusion, like liquid droplets,27–29 liposomes,30 and liquid marbles31 (BZ solution coated with hydrophobic powder), to name a few. Still today, the connection of chemical gates poses challenges in scaling up and achieving robust computations. Experimental instances of computations using chemistry and not relying explicitly on diffusion have also been achieved with biochemistry, with DNA computing. Small instances of well-known combinatorial problems like the Hamiltonian path problem have been demonstrated32 using DNA computation. However, DNA computing involves human-assisted protocols and multi-step, multi-vessel operations, and there are still unresolved challenges related to reusability, scalability, and visualization of the results. It is also important to mention a key theoretical result for computation with chemistry: the proof by Magnasco that chemistry is Turing complete.33


Figure 2.2. Abstract automata hierarchy with the three major classes of automata (FA, PDA, TM), the languages they recognize, the chosen languages for the implementation of chemical automata and the actual chemistries used for each of the implementations (reproduced with permission from Ref. [3]).

Recently, experimental realizations of automata based on chemistry have been achieved for the three major classes of automata in the automata hierarchy2 (see Figure 2.2). Hence, a chemical FA recognizing the language L1 of all words that contain at least one a and one b was experimentally implemented using a bimolecular precipitation reaction, transcribing the alphabet symbol a to chemistry as an aliquot of potassium iodate and b as an aliquot of silver nitrate. If during the computation a white precipitate (silver iodate) is observed, the input is accepted; otherwise it is rejected. A chemical 1-stack PDA recognizing the language L2 of balanced parentheses was implemented with a pH reaction network, assigning an aliquot of NaOH base to "(" and an aliquot of malonic acid to ")", such that if both are added the pH reaches the first midpoint. Hence, an input word is accepted (i.e., the parentheses are matched) if during the computation the pH is larger than or equal to the midpoint, and



exactly at the midpoint once all letters have been processed. Finally, a chemical TM recognizing the language L3 = {a^n b^n c^n, n > 0} was implemented using an oscillatory chemical reaction, the Belousov–Zhabotinsky reaction, with the following assignment of chemicals: (oxidizer) sodium bromate as a, (reducer) malonic acid as b, and (pH-altering) sodium hydroxide as c. Each of these symbols affects the patterns of relaxation oscillations in the redox potential in a distinct way, since they enhance predominantly different pathways in this rich reaction network. (Note also that the order of the chemicals affects the timing and concentration of the radicals produced by BZ during the course of the computation.)
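The logic of the precipitation-based FA is easy to mirror in software. The sketch below is schematic: concentrations are reduced to symbolic aliquot counts, and acceptance is simply the joint presence of both reagents, i.e., of the precipitate.

```python
def chemical_fa_accepts(word):
    """Mirror of the precipitation FA for L1: accept iff the word contains
    at least one 'a' (potassium iodate aliquot) and one 'b' (silver nitrate
    aliquot), in which case silver iodate precipitate would appear."""
    iodate = word.count("a")      # aliquots of potassium iodate added
    silver = word.count("b")      # aliquots of silver nitrate added
    return iodate > 0 and silver > 0

for w in ["aab", "bbb", "ba", "a"]:
    print(w, "->", "accept" if chemical_fa_accepts(w) else "reject")
```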


2.3. How Native Chemical Automata are Built in Practice

Building a native chemical automaton involves several steps. We need to assign, by a judicious choice, the language (determined by the computation we need to do), the transcription into chemistry of the alphabet symbols with which to express the words in the language by means of chemical aliquots, and the criterion for acceptance or rejection. It is important to recall that all of these have important consequences for both the chemical and the physical implications of the features associated with the accept/reject criteria. Let us take a closer look at each of these steps.

2.3.1. Selecting the language-automata pair, and the chemistry

The first step is to decide on the computational problem to be solved or, equivalently, to choose the language that the automaton will recognize. This establishes the number of alphabet symbols that are needed. Then, by making use of the extensive theoretical corpus available on abstract automata,6, 7, 34, 35 we can identify the class of abstract automata capable of recognizing the chosen language, as well as the set of rules or algorithm (i.e., the automaton's tuple, in automata jargon) that is required. Special attention needs to be paid to the number of rules (i.e., state transitions) required to carry out the computation, which will guide the chemistry selection, since the chemistry should have at least as many distinct chemical states or chemical signatures as required by the set of rules.

The first example we can study is the regular language L1 of all words that contain at least one a and one b. From automata theory we know that a finite automaton can recognize this language, and the set of rules is fairly simple, requiring just three distinct states in addition to the initial state: one accept state and two reject states (only a, or only b), see Figure 2.3(a). These requirements lead us to realize that any bimolecular reaction suffices for this computation; in particular, a precipitation reaction is convenient for its visual representation of the states.2

The next classical language that was realized experimentally in our work2, 9 is the context-free Dyck language L2, or language of balanced parentheses, having as alphabet symbols "(" and ")" plus the (necessary in computer science theory) beginning and end-of-sequence symbol "#". From automata theory it is known that a 1-stack PDA can recognize it, and that there are four states in addition to the initial state (see Figure 2.3(b)): two are reject states (excess ")", and excess "(") and only one is an accept state. In chemical terms, the stack requires an intermediate in the reaction network whose concentration can be increased or decreased (equivalent to pushing and popping an element of the stack, respectively). All these requirements are fulfilled by a pH reaction network, in which the pH acts as the stack, and its different levels (low, midpoint, high) can be assigned to the states of the transition table.2
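In software, the pH stack reduces to a counter. The sketch below is our schematic rendering, with the pH level replaced by the number of unneutralized base aliquots; a negative count stands for a pH below the midpoint.

```python
def chemical_pda_accepts(word):
    """Counter rendering of the pH-based PDA for the Dyck language L2.

    'excess' counts base aliquots not yet neutralized by acid; a negative
    value corresponds to a pH below the midpoint (an out-of-order ')').
    """
    excess = 0
    for symbol in word.strip("#"):
        if symbol == "(":
            excess += 1            # NaOH aliquot: pH rises above the midpoint
        elif symbol == ")":
            excess -= 1            # malonic acid aliquot: neutralization
        if excess < 0:
            return False           # pH fell below the midpoint: reject
    return excess == 0             # exactly at the midpoint after '#'

for w in ["#(())#", "#()()#", "#(()#", "#())(#"]:
    print(w, "->", "accept" if chemical_pda_accepts(w) else "reject")
```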


Figure 2.3. Transition graphs for (a) the precipitation-based FA recognizing L1; (b) the pH-based PDA recognizing L2; and (c) the BZ-based TM recognizing L3. The reject states are highlighted with red dashed lines, and the accept state with a green line. Annotations are made with the chemical analogue of each state.

Finally, the most advanced language implemented in native chemical automata is the context-sensitive L3 = {a^n b^n c^n, n > 0}. Automata theory establishes that a TM, or equivalently a PDA with two stacks, is required to recognize L3. This language is built from an alphabet with three symbols (a, b, c), plus the beginning and end-of-sequence symbol "#". There are many transition rules and states for this language: 10 states in addition to the initial state, of which five are distinct reject states and only one is the accept state (see Figure 2.3(c)). The need for two interacting stacks and high nonlinearity with multiple states led to the choice of nonlinear oscillatory chemistry and, in particular, of the Belousov–Zhabotinsky reaction, where two systematic descriptors of the oscillations, the frequency f and an amplitude-related metric D, represent the operation of the two stacks.2, 9, 36


2.3.2. Initial conditions and alphabet symbol assignment

Next is to implement the assignment of initial conditions and alphabet symbols. As a rule of thumb, the best candidates for assigning a chemical species to an alphabet symbol are the main reactants and, perhaps if needed, the most interconnected intermediates (or chemical species directly affecting them) in the reaction network. For L1 and L2, the assignment of symbols to the two reactants in the respective overall reactions was direct and straightforward, and the initial conditions just required water at ambient conditions for L1 and a controlled temperature for L2. "#" requires a chemical species that can highlight/enhance the output of the computation; hence for L2, a pH indicator with distinct colors for the low/midpoint/high pH levels was chosen. For L3, and its recognition by the BZ oscillatory network, the assignment of alphabet symbols requires more elaborate thinking, and kinetic simulations can be very useful in the selection.2, 36 Each of the symbols needs to enhance predominantly different pathways in this rich reaction network, which results in a distinct and specific effect on the non-oscillatory and oscillatory responses and provides characteristic signatures for the different transition states2, 36 (see Figure 2.3(c)). The chosen assignment for our implementation in the previous reference was: a = sodium bromate; b = malonic acid; c = sodium hydroxide; # = ruthenium catalyst. The initial conditions are provided by a sulfuric acid solution, and since the reaction is very sensitive to temperature and stirring conditions, these two variables (parameters of the automaton) were kept constant by a thermal bath and magnetic stirring.

2.3.3. Recipe quantification and selection of the time interval

Once chemical species are assigned to the alphabet symbols, the next step is to quantify the concentration and volume of their aliquots. Aliquot selection may require optimization, which can be carried out experimentally or, preferably, guided by the reaction mechanism and


numerical simulations. Aliquots should be concentrated enough to lead to measurable/observable changes of the chemical states, but not so large that they would deplete any of the intermediates, which would limit the maximum length of the sequences that can be processed by the automaton. In the L1 and L2 realizations, the stoichiometry of the reactions guided the quantification of the aliquots. Hence, for L1, aliquots were chosen such that once an a and a b have been inputted, the product of their concentrations exceeds the solubility limit of their product, silver iodate, so that enough precipitate is formed and becomes visible to the detection system — the naked eye2, 9 in this example. For L2, the aliquots were chosen such that once a "(" and a ")" are inputted, they neutralize each other and drive the pH to the midpoint.2, 9 For L3, a combination of simulations and experimental fine-tuning guided the quantification of the aliquots.2, 36

The time interval during which the computational processing of any symbol occurs is another key design parameter of the automaton's performance.37 The longer the time interval, the slower the computations, but the more robust the accept/reject performance and the lesser the risk of the aliquots not having sufficient time to affect the intended pathways in the reaction. Hence, choosing this interval, τ, is non-trivial and requires balancing computational, chemical, chemical-engineering and statistical considerations.37
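For the FA, the aliquot-sizing criterion discussed above is a single inequality on the ion product. The back-of-the-envelope sketch below assumes a solubility product of silver iodate of roughly 3e-8 M^2 and purely illustrative concentrations and volumes, not the published recipe.

```python
KSP_AGIO3 = 3e-8          # approximate solubility product of AgIO3 (M^2)

def precipitate_visible(n_a, n_b, aliquot_molarity=0.01,
                        volume_per_aliquot=1.0, base_volume=10.0):
    """Check whether n_a iodate and n_b silver aliquots exceed solubility.

    All numbers are illustrative; the point is only that the ion product
    [Ag+][IO3-] must exceed Ksp for the accept signal (precipitate) to appear.
    """
    total = base_volume + (n_a + n_b) * volume_per_aliquot   # mL
    iodate = n_a * aliquot_molarity * volume_per_aliquot / total
    silver = n_b * aliquot_molarity * volume_per_aliquot / total
    return iodate * silver > KSP_AGIO3

print(precipitate_visible(1, 1))   # one 'a' and one 'b': precipitate -> accept
print(precipitate_visible(3, 0))   # only 'a' symbols: no precipitate -> reject
```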


2.3.4. Accept/reject criteria optimization

Once a working recipe is established, the final stage is to evaluate the performance of the accept/reject criteria and, if needed, adjust the recipes of the aliquots and reactor to improve performance. For the chemical FA and PDA, the accept/reject is visual and intuitive. The appearance/absence of a precipitate signals the accept/reject, respectively, for the chemical FA, see Figure 2.4(a). The solution color in the chemical PDA signals the accept (orange), reject due to an out-of-order ")" (magenta) or reject due to an excess of "(" (yellow), see Figure 2.4(b). Of course, these "signs" can also be expressed chemically or physically to trigger other processes or automata as well.

Figure 2.4. Chemical accept/reject criteria for the different automata: (a) the chemical FA accepts the input sequence if a visible white precipitate is formed; (b) the pH-based PDA accepts the input sequence if during the computation the solution color remains yellow or orange, and is orange once the final # is processed; (c) the BZ-based TM accepts the sequence if the oscillatory features at the end of the # processing lie on the nonlinear locus [D#, f#]. Adapted and reproduced with permission from Ref. [2].

For the chemical TM, the first accept/reject criterion used the oscillatory features at the end of the computation, [D#, f#], see Figure 2.4(c). For words in the language accepted by the TM, there is a nonlinear relationship between these two metrics, while rejected sequences lie away from this nonlinear locus. To facilitate the accept/reject interpretation, a physically inspired criterion — analogous to part of the action in physics, and which we called the area A{word} — linearizes the values with respect to the sequence length. With a combination of simulations and experimental optimization, this criterion was rendered constant for all words in the language.2, 9, 36 It is also important to remark that the trajectory (path) in the action-like space followed by the chemical TM when accepting words in the language is very different from the trajectories


followed by the reaction when processing words that will be rejected. The path for accepted words corresponds to both maximal entropy production and the fastest energy dissipation rates.36 This intriguing property is important for applications to artificial biology.

2.4. Reconfiguration and Variants/Extension of Native Chemical Automata

2.4.1. Inclusive hierarchy and automata reconfigurability

In full correspondence with the situation in abstract automata theory, native chemical automata conform to an inclusive hierarchy; that is, automata can recognize languages at their own level and at all levels lower than their own in the hierarchy.3 Hence, the pH-based 1-stack PDA can be reconfigured to recognize L1, and the BZ-based TM can be reconfigured to recognize L2, and then to recognize L1.3 The procedure to reconfigure involves the same steps described in Section 2.3, although the assignment of chemicals to symbols becomes simpler, since there is previous experience with symbol assignment for other languages and their effect on the pathways of the reaction mechanism. Typically, the steps that require more attention are the recipe quantification and the optimization of the accept/reject criteria.

2.4.2. Extension to continuous operation (CSTR reactor)

The examples described above, including the reconfigurations of Section 2.4.1, were all run in semi-batch reactors with sequential addition of the input and zero outflow. An attractive and important extension of native chemical automata is to use a flow reactor instead of a batch reactor — more specifically, a continuously stirred tank reactor (CSTR). The BZ-TM was successfully run in a CSTR to recognize L3, demonstrating the following additional features, which clearly translate into advantages36:

(1) The automaton auto-erases and resets to the initial conditions (i.e., it is ready for a new computation) once the sequence is fully processed and a time of the order of three residence times has elapsed.
(2) The time interval can be reduced (i.e., the computational speed can be increased) because the initial conditions are in an oscillatory regime (the steady-state oscillations), so no induction period has to elapse during the processing of the sequence.
(3) And, last but certainly not least, running BZ in a CSTR can lead to periodic and non-periodic responses which are not attainable in batch, and can therefore dramatically expand the number of features that can be assigned to states and used for accept/reject criteria.

The procedure to design a native chemical automaton in a CSTR follows the same guidelines as explained in Section 2.3. However, the initial and flow conditions are chosen such that the reaction is within a stationary periodic regime. Moreover, the ratio between the residence time and the time interval between aliquots should be chosen with care, since the maximum word length that can be reliably processed without partial erasure is proportional to this ratio.36
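Feature (1) is a direct consequence of ideal-CSTR washout: with zero inflow of a species, its concentration decays exponentially on the scale of the residence time. A minimal sketch, with an illustrative residence time:

```python
import numpy as np

# Ideal-CSTR washout: with zero inflow of a species, C(t) = C0 * exp(-t/tau),
# where tau is the residence time. The value of tau is illustrative.
tau = 10.0                             # residence time, minutes (assumed)
t = np.array([1.0, 2.0, 3.0]) * tau    # one, two and three residence times
print(np.round(np.exp(-t / tau), 3))   # [0.368 0.135 0.05]: ~5% left after 3*tau
```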

2.4.3. Coupling of Belousov–Zhabotinsky to self-assembly

The objective of carrying out a computation is to transform input information into an output form of chemical information, which can then be further used. In computing with chemistry, since the output of the computation can always be a specific chemical signature, it can drive further chemical changes in a coupled chemical system or, alternatively, physical changes (e.g., precipitation) or other kinds of event, such as dissipative events (e.g., self-assembly).

Belousov–Zhabotinsky chemistry can easily be coupled to other chemistries, owing to its active reduction–oxidation and pH-dependent nature, as well as to the presence of several highly reactive intermediates, including radical species, brominating agents, etc. One of the chemistries of interest to couple to BZ is polymerization leading to self-assembled functional objects, due to its significance in the artificial biology and origins-of-life contexts.38–41 Beautifully, BZ was indeed shown to involve a free radical mechanism by the observation that the oscillations are inhibited in the presence of acrylonitrile monomer42 in the reaction vessel. Later, it was observed that such a reaction mixture leads to polymerization of the acrylonitrile and to oscillatory features in BZ modified from those observed in the absence of a polymerization monomer.43

Recently, it has been demonstrated experimentally that BZ can drive polymerization-induced self-assembly (PISA) as well.38 PISA consists in the chain-extension of a soluble polymer chain with a second monomer (soluble as monomer but solvophobic as polymer) to form an amphiphilic diblock copolymer. As polymerization progresses and the solvophobic chain grows longer, the amphiphile autonomously undergoes self-assembly into a range of morphologies (see Figure 2.5). BZ-driven PISA has already been run successfully with several monomers and initial macro-chain-transfer agents, and has been demonstrated to provide in an aqueous medium not only the typical PISA morphologies but also giant (μ-sized) polymersomes38,44–46 and multi-compartment systems.47

Figure 2.5. Illustration of the BZ-driven PISA process. BZ oscillations generate radicals leading to polymerization. As polymerization progresses, the solvophobic part of the amphiphile grows longer, leading to autonomous self-assembly. Moreover, the self-assemblies change, also in an autonomous manner. Image elements, including TEM images, are adapted and reproduced with permission from Ref. [46].

BZ-PISA is typically run in a batch reactor, but in order to simultaneously control and tune the morphology of the self-assembled objects and the oscillatory features of their cargo, a continuously stirred tank reactor is more appropriate.46 All these findings pave the way to using native automata to control self-assembly, whether with regard to the morphology, the size and/or the computational capabilities of the cargo in the self-assembled objects.

2.5. Conclusions and Outlook

Computing with chemistry takes full advantage of the fact that chemical reactions are equivalent to recognition events, where molecules, and the sequence in which they are fed to a reaction, are recognized and mechanically transformed into other molecular species with capabilities different from those of the "input" molecules. The function of "mechanical transformation" is performed by the chemical reaction. Just as in theoretical computer science there are physically implementable abstract automata which can be classified into an inclusive hierarchy according to their complexity, the Chomsky hierarchy, so there are in chemistry. The chemical automata, and the reactions representing and implementing them, have been experimentally shown to conform to the Chomsky hierarchy of automata and the languages they recognize.

Fully chemical automata are necessarily limited in the length of tape they can implement, because an unbounded tape length requires access to infinite resources for operation,48 something that cannot be supported with matter constrained by the energy–momentum tensor. However, because of Geffert's theorems,49 access to the top level of the languages recognized by the TM class of automata can also be achieved once a chemically realized two-stack push-down automaton is available. Oscillatory chemistry, with its internal autocatalysis and complementary radicals, implements them. There is no need to use biomolecules, as small non-biochemical molecules support chemical (relaxation) oscillations. Thus, experimentally one can construct automata at all levels in the hierarchy, which provides experimental

confirmation of the theoretical proof that "chemistry is Turing complete".33 Because of the above, Feynman's requirements for simulating chemistry with chemistry are fulfilled: by using native chemical computing, chemistry can be simulated by chemistry. This opens an important door for the application of chemical computation in emerging areas at the interface between chemistry and computer science, including artificial intelligence and machine learning. But it also uniquely opens the door to the design and construction of new material systems which are autonomously built at the molecular level by molecules themselves, using parts available in their environments, and which have the ability to implement complex functions such as system self-replication or adaptation. Although much more needs to be done, some initial steps towards these exciting developments have already been demonstrated by the application of the Belousov–Zhabotinsky oscillatory chemical reaction to build a TM, and also to build and operate the polymerization-induced self-assembly of primitive self-dividing molecular structures.

The acceptance of words in the language of a native chemical automaton is associated with paths followed by an action-like quantity, which imply that for the words in the language the energy dissipation and entropy production rates are extremal and maximal, that is, fastest. It is surprising, and at the same time encouraging, that the native chemical automata can be run in simple table-top one-pot reactors and require neither complex setups nor tailor-made devices, as in other chemical computing approaches. Indeed, Figure 2.6 shows the evolution of the setup of the BZ-based TM over time: only standard glassware was required for the proof of concept, and even one of the latest setups, the CSTR implementation, requires nothing more than standard off-the-shelf laboratory equipment such as a jacketed reactor, a thermal bath and redox-potential metering.

Many difficulties, challenges and opportunities lie ahead where native chemical computation can contribute to interesting/major applications. For example, the following questions provide a very

Figure 2.6. Evolution over time of the BZ-based TM setup: from the proof-of-concept experiments (leftmost picture) to the CSTR BZ-based TM. Standard commercial stock glassware and devices are used, not requiring complex setups or specially made devices.

short list of pressing questions: (1) how can one disentangle the computing power of 10²³ highly correlated molecules processing in parallel? (2) how can one implement reversible computing? (3) how can one implement structures such as neural networks and make progress in the emulation of the human brain by using synthetic artificial chemical systems? These are but three examples of the exciting research areas which open up as a consequence of the availability of native chemical computation.

References

1. R. P. Feynman, Simulating physics with computers. Int. J. Theor. Phys. 21, 467–488 (1982).
2. M. Dueñas-Díez and J. Pérez-Mercader, How chemistry computes: language recognition by non-biochemical chemical automata. From finite automata to Turing machines. iScience 19, 514–526 (2019), doi:10.1016/j.isci.2019.08.007.
3. M. Dueñas-Díez and J. Pérez-Mercader, In-vitro reconfigurability of native chemical automata, the inclusiveness of their hierarchy and their thermodynamics. Scientific Reports 10, 6814 (2020), doi:10.1038/s41598-020-63576-6.
4. T. Sienko and J. M. Lehn, Molecular recognition: storage and processing of molecular information. In T. Sienko, A. Adamatzky, N. Rambidi and M. Conrad (eds.), Molecular Computing (MIT Press, 2003).
5. T. E. H. Allen, How do machines learn? In Machine Learning in Chemistry: The Impact of Artificial Intelligence (2020), pp. 16–36, doi:10.1039/9781839160233-00016.
6. J. E. Hopcroft, R. Motwani and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation, 3rd edn. (Pearson Education, 2007).
7. D. Cohen, Introduction to Computer Theory, 2nd edn. (John Wiley & Sons, New York, 1991).

8. A. M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc. 2, 230–265 (1936).
9. M. Dueñas-Díez and J. Pérez-Mercader, Native chemical automata and the thermodynamic interpretation of their experimental accept/reject responses. In D. H. Wolpert, C. Kempes, J. A. Grochow and P. F. Stadler (eds.), The Energetics of Computing in Life and Machines (SFI Press, Santa Fe, 2019), pp. 119–139.
10. Y. Benenson, T. Paz-Elizur, R. Adar, E. Keinan, Z. Livneh and E. Shapiro, Programmable and autonomous computing machine made of biomolecules. Nature 414(6862), 430 (2001).
11. J. Pérez-Mercader, M. Dueñas-Díez and D. Case, U.S. Patent No. 9,582,771. Washington, DC: U.S. Patent and Trademark Office (2017).
12. H. Goldstine, The Computer from Pascal to von Neumann (Princeton University Press, 1972).
13. M. Gardner, Logic Machines, Diagrams and Boolean Algebra (Dover Publications, 1968).
14. M. d'Ocagne, Le calcul simplifié (1905), p. 95, as cited by Howard Aiken in A Manual of Operation for the Automatic Sequence Controlled Calculator (1946), p. 7, available at http://sites.harvard.edu/~chsi/markone/manual.html
15. S. Ramón y Cajal, The structure and connection of neurons. In Nobel Lectures: Physiology or Medicine, 1901–1921 (Elsevier, Amsterdam, 1906), pp. 220–253.
16. H. Bruderer, in https://cacm.acm.org/blogs/blog-cacm/247429-ai-began-in-1912/fulltext?mobile=false, January 4, 2020.
17. W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5(4), 115–133 (1943).
18. M. Conrad, Information processing in molecular systems. BioSystems 5(1), 1–14 (1972).
19. M. Okamoto, T. Sakai and K. Hayashi, Switching mechanism of a cyclic enzyme system: role as a 'chemical diode'. BioSystems 21(1), 1–11 (1987).
20. A. Hjelmfelt, E. D. Weinberger and J. Ross, Chemical implementation of neural networks and Turing machines. Proc. Natl. Acad. Sci. USA 88, 10983–10987 (1991).
21. K. Agladze, R. R. Aliev, T. Yamaguchi and K. Yoshikawa, Chemical diode. J. Phys. Chem. 100(33), 13895–13897 (1996).
22. A. Tóth and K. Showalter, Logic gates in excitable media. J. Chem. Phys. 103(6), 2058–2066 (1995).
23. A. Adamatzky and B. D. L. Costello, Experimental logical gates in a reaction-diffusion medium: the XOR gate and beyond. Phys. Rev. E 66(4), 046112 (2002).
24. J. Gorecki, K. Yoshikawa and Y. Igarashi, On chemical reactors that can count. J. Phys. Chem. A 107(10), 1664–1669 (2003).
25. I. N. Motoike and A. Adamatzky, Three-valued logic gates in reaction–diffusion excitable media. Chaos, Solitons & Fractals 24(1), 107–114 (2005).

26. A. Adamatzky, Binary full adder, made of fusion gates, in a subexcitable Belousov–Zhabotinsky system. Phys. Rev. E 92(3), 032811 (2015).
27. A. L. Wang, J. M. Gold, N. Tompkins, M. Heymann, K. I. Harrington and S. Fraden, Configurable NOR gate arrays from Belousov–Zhabotinsky micro-droplets. Eur. Phys. J. Spec. Topics 225(1), 211–227 (2016).
28. J. Delgado, N. Li, M. Leda, H. O. Gonzalez-Ochoa, S. Fraden and I. R. Epstein, Coupled oscillations in a 1D emulsion of Belousov–Zhabotinsky droplets. Soft Matter 7, 3155–3167 (2011), doi:10.1039/c0sm01240h.
29. K. Gizynski and J. Gorecki, Chemical memory with states coded in light-controlled oscillations of interacting Belousov–Zhabotinsky droplets. Phys. Chem. Chem. Phys. 19(9), 6519–6531 (2017).
30. R. Tomasi, J.-M. Noel, A. Zenati, S. Ristori, F. Rossi, V. Cabuil, F. Kanoufi and A. Abou-Hassan, Chemical communication between liposomes encapsulating a chemical oscillatory reaction. Chem. Sci. 5, 1854–1859 (2014), doi:10.1039/C3SC53227E.
31. T. C. Draper, C. Fullarton, N. Phillips, B. P. de Lacy Costello and A. Adamatzky, Liquid marble interaction gate for collision-based computing. Materials Today 20(10), 561–568 (2017).
32. L. M. Adleman, Molecular computation of solutions to combinatorial problems. Science 266(5187), 1021–1024 (1994).
33. M. O. Magnasco, Chemical kinetics is Turing universal. Phys. Rev. Lett. 78(6), 1190 (1997).
34. E. Rich, Automata, Computability, and Complexity: Theory and Applications (Pearson/Prentice Hall, New Jersey, USA, 2008).
35. P. Linz, An Introduction to Formal Languages and Automata, 5th edn. (Jones & Bartlett Learning, 2012).
36. M. Dueñas-Díez and J. Pérez-Mercader, Native chemical computing. An application of Belousov–Zhabotinsky chemistry in its oscillatory regime. Front. Chem. 9, 611120 (2021).
37. T. C. Draper, M. Dueñas-Díez and J. Pérez-Mercader, The importance of the 'time interval' in a Belousov–Zhabotinsky operated chemical Turing machine. RSC Advances (2021), doi:10.1039/d1ra03856g.
38. B. P. Bastakoti and J. Pérez-Mercader, Facile one-pot synthesis of functional giant polymeric vesicles controlled by oscillatory chemistry. Angew. Chem. Int. Ed. 56, 12086–12091 (2017).
39. G. Cheng and J. Pérez-Mercader, Polymerization-induced self-assembly for artificial biology: opportunities and challenges. Macromol. Rapid Commun. 40(2) (2019), doi:10.1002/marc.201800513.
40. S. Pearce and J. Pérez-Mercader, PISA: construction of self-organized and self-assembled functional vesicular structures. Polym. Chem. (2020), doi:10.1039/D0PY00564A.
41. J. Pérez-Mercader, De novo laboratory synthesis of life mimics without biochemistry. In ALIFE 2020: The 2020 Conference on Artificial Life (2020), pp. 483–490, doi:10.1162/isal_a_00282.

42. Z. Váradi and M. T. Beck, Inhibition of a homogeneous periodic reaction by radical scavengers. J. Chem. Soc. Chem. Commun. 2, 30–31 (1973).
43. R. P. Washington, W. W. West, G. P. Misra and J. A. Pojman, Polymerization coupled to oscillating reactions: (1) a mechanistic investigation of acrylonitrile polymerization in the Belousov–Zhabotinsky reaction in a batch reactor. J. Am. Chem. Soc. 121, 7373–7380 (1999).
44. B. P. Bastakoti and J. Pérez-Mercader, Autonomous ex novo chemical assembly with blebbing and division of functional polymer vesicles from a "homogeneous mixture". Adv. Mater. 29, 1704368–1704373 (2017).
45. B. P. Bastakoti, S. Guragain and J. Pérez-Mercader, Direct synthesis of hundred-nanometer and beyond scale polymer vesicles using chemical oscillations. Chem. Eur. J. 24, 10621–10624 (2018), doi:10.1002/chem.201801633.
46. L. Hou, M. Dueñas-Díez, R. Srivastava and J. Pérez-Mercader, Flow chemistry controls self-assembly and cargo in Belousov–Zhabotinsky driven polymerization-induced self-assembly. Commun. Chem. 2, 139 (2019), doi:10.1038/s42004-019-0241-1.
47. G. Cheng and J. Pérez-Mercader, Dissipative self-assembly of dynamic multicompartmentalized microsystems with light-responsive behaviors. Chem 6(5), 1160–1171 (2020), doi:10.1016/j.chempr.2020.02.009.
48. M. L. Minsky, Computation: Finite and Infinite Machines (Prentice Hall, Englewood Cliffs, New Jersey, 1967).
49. V. Geffert, Normal forms for phrase-structure grammars. Informatique théorique et applications 25, 473–496 (1991).

© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0003

Chapter 3

Discovering Boolean Functions on Actin Networks

Andrew Adamatzky∗,‖, Stefano Siccardi∗, Florian Huber†, Jörg Schnauß‡,∗,∗∗ and Jack Tuszyński§,¶

∗ Unconventional Computing Lab, Department of Computer Science and Creative Technologies, UWE, Bristol, UK
† Netherlands eScience Center, Science Park 140, 1098 XG Amsterdam, The Netherlands
‡ Soft Matter Physics Division, Peter Debye Institute for Soft Matter Physics, Faculty of Physics and Earth Science, Leipzig University, Germany & Fraunhofer Institute for Cell Therapy and Immunology (IZI), DNA Nanodevices Group, Leipzig, Germany
§ Department of Oncology, University of Alberta, Edmonton, AB T6G 1Z2, Canada
¶ DIMEAS, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, TO, Turin, Italy
‖ [email protected]
∗∗ [email protected]

Actin filaments are conductive to ionic currents as well as to mechanical and voltage solitons. Two families of computing devices are discussed: collision-based computing and voltage-based computing. In collision-based computing, we employ travelling localizations to generate computing circuits from actin networks. The propagation of localizations on a single actin filament is experimentally unfeasible to control. Therefore, we consider excitation waves propagating on bundles of actin filaments. In computational experiments with a 2D slice of an actin bundle network, we show that by using an arbitrary arrangement

of electrodes, it is possible to implement two-input–one-output circuits and a finite state machine realizing a {0, 1}ⁿ → {0, 1}ⁿ mapping. In voltage-based computing, we consider the bundles as electrical wires with either low or high filament densities. A set of equations describing the network is solved for several initial conditions. Input voltages, which can be considered as information bits, are applied at one set of points, and output voltages are computed at another set of positions. We consider both an idealized situation, where point-like electrodes can be inserted at any points of the bundles, and a more realistic case, where electrodes lie on a surface and have the typical dimensions available in industry. We find that in both cases such a system can implement the main logical gates and a finite state machine.

3.1. Introduction

The idea of implementing a computation by using collisions of signals travelling along 1D, nonlinear geometries can be traced back to the mid-1960s, when Atrubin developed a chain of finite-state machines executing multiplication,1 Fischer designed prime-number generators in cellular automata,2 and Waksman proposed the eight-state solution for the firing squad synchronization problem.3 In 1986, Park, Steiglitz and Thurston4 designed a parity filter in cellular automata with soliton-like dynamics of localizations. Their design led to the construction of a 1D particle machine, which performs computation by colliding particles in 1D cellular automata, that is, the computing is embedded in a bulk medium.5 Exploring ways to translate the purely theoretical ideas of collision-based computing6,7 into nanocomputing at a subcellular level, we consider actin networks as ideal candidates for a computing substrate. The idea of subcellular computing on cytoskeleton networks was first proposed by Hameroff and Rasmussen in the context of microtubule automata in the 1980s.8–10 Priel, Tuszynski and Cantiello analyzed how information processing could be realized in actin–tubulin networks of neuron dendrites.11 In the present chapter we focus purely on actin. Actin is a crucial protein, which is highly conserved throughout all eukaryotic cells. It is present in the forms of monomeric, globular actin (G-actin) and filamentous actin (F-actin).12–14 Under the appropriate conditions, G-actin polymerizes into F-actin, forming a double-helical structure.15,16 Signals in the actin networks could be

represented by travelling localizations. The existence of travelling localizations (defects, ionic waves, solitons) in cytoskeleton polymer networks is supported by (bio)physical models.17–26,55,56 Why is actin more advantageous than other polymers for developing unconventional computers? We provided detailed answers in Ref. [27], which we briefly summarize below. DNA has proven to act well as a nanowire; however, no transformations of signals have been observed. Tubulin microtubules can act as wires and signal amplifiers; however, there is no experimental evidence of voltage solitons propagating along microtubules. Actin filaments display a very high (compared to DNA and microtubules) charge density (ca. 1.65 × 10⁵ e/μm), manifested in a large electric dipole moment.28 Actin filaments also behave as nonlinear, inhomogeneous transmission lines supporting the propagation of nonlinear, dispersive waves and solitons.17–23,25 Actin can even renew itself via polymerization and depolymerization, which can be further tuned with accessory proteins or bionic complexes.29 On the relevant length scales it is less structurally complex than DNA, and therefore experimental prototyping is easier to achieve. Actin is a macro-molecular actuator,30,31 which opens additional application domains for actin computing circuits: embedded controllers for molecular machinery. Furthermore, the investigated structures are especially suitable since they form by self-assembly, settling into an energetic minimum. In this form, the structures can be stable over days even without additional treatment, and they reanneal quickly even after harsh mechanical deformations.

Computational studies discussed the feasibility of implementing Boolean gates on a single actin filament32 and at an intersection of several actin filaments33 via collisions between solitons. Further studies applied a reservoir-computing-like approach to discover functions on a single actin unit34 as well as on a filament.35 In 2016, for instance, we demonstrated that it is possible to implement logical circuits by linking the protein chains.33 In such a setup, Boolean values are represented by localizations travelling along the filaments, and the computation is realized via collisions between localizations at the junctions between the chains. We have shown that and, or, and not gates can be implemented in such setups. These gates can be

cascaded into hierarchical circuits, as we have shown with the example of nor.33 The theoretical models developed so far addressed the processing of information on a single actin unit or a chain of a few units. Whilst attractive from a computing point of view, this appears difficult to implement under experimental laboratory conditions. In the present work, we therefore developed an alternative version of computing on actin networks by considering excitation waves propagating on bundles of actin filaments.36,37 Not a single actin filament is considered, but an overall "density" of the conductive material formed by the actin bundles, arranged by crowding effects without the need for additional accessory proteins.38–40 First results of this approach are presented in this chapter.

3.2. The Actin Network

As a template for our actin droplet machine we used an actual 3D actin bundle network produced in laboratory experiments with purified proteins (Figure 3.1). The underlying experimental method was shown to reliably produce regularly spaced bundle networks from homogeneous filament solutions inside small isolated droplets, in the absence of molecular motor-driven processes or other accessory proteins.41 These structures effectively form very stable and long-living 3D networks, which can be readily imaged with confocal microscopy, resulting in stacks of optical 2D slices (Figure 3.1). The dimensions of the network are the following: the size along the x coordinate is 225 μm (width), along the y coordinate 222 μm (height), and along the z coordinate 112 μm (depth); the voxel width is 0.22 μm, height 0.22 μm and depth 4 μm.

3.3. Spike-based Gates

3.3.1. Automaton model

The original image A = (a_ijz), 1 ≤ i, j ≤ n, 1 ≤ z ≤ m, where n = 1024, m = 30 and a_ijz = {r_ijz, g_ijz, b_ijz} are the RGB values of the element at (i, j, z), 1 ≤ r_ijz, g_ijz, b_ijz ≤ 255, was converted to a conductive matrix

Figure 3.1. Exemplary z-slices of a 3D actin bundle network formed as described in Ref. [41].

C = (c_ijz), 1 ≤ i, j ≤ n, 1 ≤ z ≤ m, as follows: c_ijz = 1 if r_ijz > 40, g_ijz > 19 and b_ijz > 19. The conductive matrices are shown in Figure 3.2. The 3D conductive matrix is compressed along the z-axis to reduce the consumption of computational resources.

Figure 3.2. Exemplary z-slices of “conductive” geometries C selected from the 3D actin bundle network shown in Figure 3.1, which were formed as described in Ref. [41].

To model the activity of an actin bundle network, we represent it as an automaton A = ⟨C, Q, r, h, θ, δ⟩, where C ⊂ Z³ is a set of voxels, the conductive matrix C. Each voxel p ∈ C takes states from the set Q = {★, •, ◦}, excited (★), refractory (•) and resting (◦), and

is complemented by a counter h_p that handles the temporal decay of the refractory state. Following discrete time steps, each voxel p updates its state depending on its current state and the states of its neighborhood u(p) = {q ∈ C : d(p, q) ≤ r}, where d(p, q) is the Euclidean distance between voxels p and q, and r ∈ N is a neighborhood radius; θ ∈ N is an excitation threshold and δ ∈ N is a refractory delay. All voxels update their states in parallel and by the same rule:

p^{t+1} = ★, if (p^t = ◦) and (σ(p)^t > θ); p^{t+1} = •, if (p^t = ★) or ((p^t = •) and (h^t_p > 0)); p^{t+1} = ◦, otherwise;

h^{t+1}_p = δ, if (p^{t+1} = •) and (p^t = ★); h^{t+1}_p = h^t_p − 1, if (p^{t+1} = •) and (h^t_p > 0); h^{t+1}_p = 0, otherwise.

Every resting (◦) voxel of C excites (★) at the moment t + 1 if the number of its excited neighbors at the moment t, σ(p)^t = |{q ∈ u(p) : q^t = ★}|, exceeds the threshold θ. An excited voxel, p^t = ★, takes the refractory state • at the next time step t + 1, and at the same moment the counter of the refractory state, h_p, is set to the refractory delay δ. The counter is decremented, h^{t+1}_p = h^t_p − 1, at each iteration until it becomes 0. When the counter h_p becomes zero, the voxel p returns to the resting state ◦. For all results shown in this chapter, the neighborhood radius was set to r = 3. Choices of θ and δ are considered in Section 3.3.3.

3.3.2. Interfacing with the network

To stimulate the network and to record its activity, we assigned several domains of C as electrodes. We calculated the potential p^t_c at an electrode location c ∈ C as p^t_c = |{z ∈ C : d(c, z) < r_e and z^t = ★}|, where d(c, z) is the Euclidean distance between sites c and z in 3D space. We chose an electrode radius of r_e = 4 voxels and conducted two families of experiments with two configurations of electrodes.
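The update rule above is simple to prototype. The following sketch is our own illustration of the automaton step and the electrode potential, assuming a boolean conductive matrix cond (the matrix C of Section 3.3.1) and using SciPy to count excited neighbours; it is not the code used to produce the results below.

```python
import numpy as np
from scipy.ndimage import convolve

RESTING, EXCITED, REFRACTORY = 0, 1, 2

def ball_kernel(r=3):
    # Neighbourhood u(p) = {q : d(p, q) <= r}, excluding p itself.
    ax = np.arange(-r, r + 1)
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
    k = (X**2 + Y**2 + Z**2 <= r**2).astype(int)
    k[r, r, r] = 0
    return k

def step(state, h, cond, kernel, theta=7, delta=20):
    # sigma(p): number of excited neighbours of every voxel.
    sigma = convolve((state == EXCITED).astype(int), kernel, mode="constant")
    new = np.full_like(state, RESTING)
    # Excited voxels, and refractory voxels whose counter is positive, are refractory.
    new[(state == EXCITED) | ((state == REFRACTORY) & (h > 0))] = REFRACTORY
    # Resting conductive voxels with more than theta excited neighbours fire.
    new[cond & (state == RESTING) & (sigma > theta)] = EXCITED
    # Counter: set to delta on entering the refractory state, decrement otherwise.
    new_h = np.where(state == EXCITED, delta, np.maximum(h - 1, 0))
    new_h[new != REFRACTORY] = 0
    return new, new_h

def potential(state, centre, re=4):
    # p_c = number of excited voxels within Euclidean distance re of the electrode.
    idx = np.argwhere(state == EXCITED)
    if idx.size == 0:
        return 0
    return int((np.linalg.norm(idx - np.asarray(centre), axis=1) < re).sum())
```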

Table 3.1. Coordinates of electrodes in experiments family E1.

e     i    j    z
1     369  567  6
2     509  580  10
3     631  590  10
4     382  322  12
5     533  331  23
6     626  463  7
7     358  676  22
8     369  424  7
9     572  691  17
10    705  394  17

Figure 3.3. Configurations of electrodes in the 3D network of actin bundles used in (a) E1 and (b) E2 . Depth of the network is shown by level of grey. Sizes of the electrodes are shown in perspective.

In the first family of experiments, E1, we studied the frequencies of two-input–one-output Boolean functions implementable in the network. We used ten electrodes; their coordinates are listed in Table 3.1 and the configuration is shown in Figure 3.3(a). Electrode E0, representing input x, and electrode E9, representing input y, are the input electrodes; all the others are output electrodes, representing the outputs z1, . . . , z8. Results are presented in Section 3.3.3.

Table 3.2. Coordinates of electrodes in experiments family E2.

e    i    j    z
1    369  567  6
2    509  580  10
3    631  590  10
4    382  322  12
5    533  331  23
6    369  424  7
7    572  691  17
8    705  394  17

In the second family of experiments, E2, we used six electrodes (Table 3.2 and Figure 3.3(b)). All electrodes were considered as inputs during stimulation and as outputs during the recording of the network activity.

Exemplary snapshots of the excitation dynamics in the network are shown in Figure 3.4. The domains corresponding to the two electrodes e0 and e9 (Table 3.1 and Figure 3.3(a)) have been excited (Figure 3.4(a)). The excitation wave fronts propagate away from e0 and e9 (Figure 3.4(b)). The fronts traverse the whole breadth of the network (Figure 3.4(c)). Due to the presence of circular conductive paths in the network, repetitive patterns of activity emerge (Figure 3.4(d)). Recordings of the potential and videos of the experiments are available in the Zenodo repository.42

3.3.3. Maximizing the number of logical gates

To design an actin droplet machine with complex behavior, we need to find the values of the refractory delay and excitation threshold for which the actin bundle network executes a maximum of Boolean gates. To map the dynamics of the network onto sets of gates, we undertook the following trials of stimulation:

Figure 3.4. Snapshots of excitation dynamics on the network. The excitation wave front is red and the refractory tail is magenta. The excitation threshold is θ = 7 and the refractory delay is δ = 20.

(1) fixed refractory delay δ = 20 and excitation threshold θ = 4, 5, . . . , 12;
(2) fixed excitation threshold θ = 7 and refractory delay δ = 10, 15, 17, . . . , 24, 30.

An example of the network's spiking activity in response to stimulation is shown in Figure 3.5. We stimulated the network with all possible configurations of inputs, recorded the network's

Figure 3.5. A scheme of a virtual experiment. The actin bundle network is shown as a three-dimensional Delaunay triangulation. Electrodes are shown by thick lines and labelled E1 to E5 . Exemplary trains of spikes are shown near the electrodes.

electrical dynamics, and then extracted logical gates as follows. For each possible combination (i, j, k), 1 ≤ i, j, k ≤ 6, i ≠ j, i ≠ k, j ≠ k, we considered electrodes Ei and Ej to be inputs, representing the Boolean variables x and y, respectively, and electrode Ek to be the output electrode, representing the result of a Boolean function. To input x = True, we applied a current to electrode Ei; to input y = True, to electrode Ej. Then we recorded the potential at electrode Ek. Two-input–one-output logical functions were extracted from the spiking events as follows. Assume each spike represents logical True, and that spikes less than six iterations apart happen at the same moment. The representation of gates by spikes and their combinations is then as shown in Figure 3.6. For each combination (δ, θ), we counted the numbers of the gates or (x + y), and (x·y), xor (x ⊕ y), not-and (¬x·y), and-not (x·¬y) and select(x, y).
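The extraction scheme of Figure 3.6 amounts to a small truth table over the three stimulated input pairs. A minimal sketch, assuming spike trains are given as lists of iteration indices:

```python
# A sketch of the gate table behind Figure 3.6. Spikes closer than six
# iterations are treated as simultaneous; a spike stands for logical True.
def spike_at(t, train, tol=6):
    return any(abs(t - s) < tol for s in train)

GATES = {
    (True,  True,  True):  "or",
    (False, False, True):  "and",
    (True,  True,  False): "xor",
    (False, True,  True):  "select(x)",
    (True,  False, True):  "select(y)",
    (False, True,  False): "and-not",   # x AND NOT y
    (True,  False, False): "not-and",   # NOT x AND y
}

def gate_at(t, r01, r10, r11):
    """r01, r10, r11: spike trains for inputs (F,T), (T,F) and (T,T)."""
    return GATES.get((spike_at(t, r01), spike_at(t, r10), spike_at(t, r11)))

print(gate_at(40, r01=[12, 40], r10=[40], r11=[40]))   # 'or'
```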

Figure 3.6. Representation of two-input–one-output Boolean gates by combinations of spikes (potential, in excited voxels, versus time, in iterations). The black dotted line shows the potential at an output electrode when the network was stimulated by the input pair (x, y) = (False, True), the red solid line by (True, False), and the green dashed line by (x, y) = (True, True).

We found that, overall, the total number of gates ν(θ) realized by the network decreases with increasing θ (Figure 3.7(a)). The function ν(θ) is nonlinear and can be adequately described by a fifth-degree polynomial. It reaches its maximal value at θ = 7 (Figure 3.7(a)). or gates are most commonly realized at θ = 11, and gates at θ = 6, and xor gates at θ = 5 as well as at θ = 7 (Figure 3.7(b)). The number of and-not gates implemented by the network reaches its highest value at θ = 6 and then drops sharply after θ = 8 (Figure 3.7(c)). not-and gates are more common at θ = 5, 7, 9, 11, while select(x) has its peak at θ = 7 and select(y) at θ = 8, 9 (Figure 3.7(c)). The total number of gates realized in the network with the excitability threshold fixed to θ = 7 decreases with increasing δ. Oscillations of ν(δ) are visible at 15 ≤ δ ≤ 25 (Figure 3.7(d)). The three highest values of ν(δ) are achieved at δ = 10, 17 and 20. Let us now look at the dependence of the numbers of or, and and xor gates on the refractory delay δ in Figure 3.7(e). The number of or gates increases as δ increases from 10 to 15, then drops substantially at δ = 18, to reach its maximum at δ = 19. The numbers of and and xor gates behave similarly to each other; both have a pronounced peak at δ = 20 (Figure 3.7(e)).

Figure 3.7. An average number ν of gates realizable on each of the electrodes e1, . . . , e8 as a function of the excitation threshold θ when the refractory delay is fixed to δ = 20 (a–c), and of the refractory delay δ when the threshold is fixed to θ = 7 (d, e). (a) Number of gates ν versus threshold θ, δ = 20. (b) Number of or (black circle), and (orange solid triangle) and xor (red blank triangle) gates, δ = 20. (c) Number of not-and (yellow blank triangle), and-not (magenta solid triangle), select(x) (cyan blank rhombus) and select(y) (light blue disc) gates, δ = 20. (d) Number of gates ν versus delay δ, θ = 7. (e) Number of or (black circle), and (orange solid triangle) and xor (red blank triangle) gates, θ = 7.

Figure 3.8. All spikes recorded at each electrode for input binary strings from 1 to 63. The representation is implemented as follows. We stimulate M with strings from {0, 1}⁶ and represent a spike detected at time t by a black pixel at position t along the horizontal axis. The plot for each electrode ei represents a binary matrix S = (s_zt), where 1 ≤ z ≤ 63 and 1 ≤ t ≤ 1000: s_zt = 1 if the input configuration was z and a spike was detected at moment t, and s_zt = 0 otherwise.

The gate-frequency analysis presented in this section allows us to choose θ = 7 and δ = 20 for the actin droplet machine constructed in the next section.

3.3.4. Actin droplet machine

An actin droplet machine is defined as a tuple M = ⟨A, k, E, S, F⟩, where A is an actin network automaton, as defined in Section 3.3.1, k is a number of electrodes, E is a configuration of electrodes, S = {0, 1}^k, and F: S → S is a state-transition function that implements a mapping between the sets of all possible configurations of binary strings of length k. In the experiments reported here, k = 6. We have chosen six electrodes; their locations are shown in Figure 3.3(b) and their exact coordinates in Table 3.2. Thus F: {0, 1}⁶ → {0, 1}⁶, and the machine M has 64 states. We represent the inputs and the machine states in decimal encoding. Spikes detected in response to every input from {0, 1}⁶ are shown in Figure 3.8. Global transition graphs of M for selected inputs are shown in Figure 3.9. The nodes of the graphs are states of M; edges show transitions between the states. These directed graphs are defined

Figure 3.9. State transitions of machine M for selected inputs: (a) I = 5, (b) I = 15, (c) I = 31, (d) I = 63. A node is a decimal encoding of the M state (e^t_0 . . . e^t_5).

as follows: there is an edge from node a to node b if there is 1 ≤ t ≤ 1000 such that M^t = a and M^{t+1} = b. Let us now define a weighted global transition graph G = ⟨Q, E, w⟩, where Q is a set of nodes (isomorphic to {0, 1}⁶), E is a set of edges, and the weighting function w: E → [0, 1] assigns a number from the unit interval to each edge. Let a, b ∈ Q and e(a, b) ∈ E; then the normalized weight is calculated as

w(e(a, b)) = Σ_{i∈Q, t∈T} χ(s^t = a and s^{t+1} = b) / Σ_{d∈Q, t∈T} χ(s^t = a and s^{t+1} = d),

where χ takes the value "1" when the

conditions are true and "0" otherwise. In words, w(e(a, b)) is the number of transitions from a to b observed in the evolution of M, for all possible inputs from Q during the time interval T, normalized by the total number of transitions from a to all other nodes. The graph G is visualized in Figure 3.10(a). Nodes which have predecessors are 1–6, 8–10, 12, 16–21, 24, 25, 28, 32–34, 36–38, 40, 41, 44, 48–50, 52, 53, 56. Nodes without predecessors are 7, 11, 13–15, 22, 23, 26, 27, 29–31, 35, 39, 42, 43, 45–47, 51, 54, 55, 57–63.

Let us convert G to an acyclic non-weighted graph of more likely transitions, G* = ⟨Q, E*⟩, where e(a, b) ∈ E* if w(e(a, b)) = max{w(e(a, c)) | e(a, c) ∈ E}. That is, for each node we select an outgoing edge with maximum weight. The graph is a tree, see Figure 3.10(b). Most states, apart from 1, 2, 4, 8, 16, 20, 32, are Garden-of-Eden configurations, which have no predecessors. The in-degrees ν(·) of the non-Garden-of-Eden nodes are ν(20) = 1, ν(32) = 2, ν(2) = 3, ν(4) = 4, ν(1) = 5, ν(16) = 6, ν(8) = 12. There is one fixed point, the state 1, corresponding to the situation when a spike is recorded only on electrode e5; it has no successors.

By analyzing G, we can characterize the richness of M's responses to input stimuli. We define richness as the number of different states over all inputs, as shown in Table 3.3, with distributions in Figure 3.12(a). The number of produced states increases from under five at the beginning of M's evolution and then reaches circa seven states on average. Oscillations around this value are seen in Figure 3.12(a). Figure 3.12(b) shows the number of different nodes generated in the evolution of M when stimulated by a given input. Fewer than 15 different states are found in the evolutions in response to inputs 1 to 21 (21 corresponds to the binary input string 010101); thereafter the number of different nodes stays around 25. The diagram in Figure 3.12(c) shows how many inputs might lead to a given state/node of M. Some of the states/nodes are seen to be Garden-of-Eden configurations E (nodes without predecessors) and thus could not be generated by stimulating M with sequences from Q − E.

Assume T is the set of temporal moments when the machine responded to at least one input string with a non-zero state. The configurations at each transition t can be considered as outputs
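The weights w(e(a, b)) and the pruned graph G* are straightforward to compute from recorded state sequences. A minimal sketch, assuming a dictionary runs that maps each input word to its observed sequence of states of M:

```python
from collections import defaultdict

def weighted_transition_graph(runs):
    """runs: dict mapping each input word to its sequence of M states s^t."""
    count, out = defaultdict(int), defaultdict(int)
    for states in runs.values():
        for a, b in zip(states, states[1:]):
            count[(a, b)] += 1
            out[a] += 1            # all transitions leaving a, the normalizer
    return {(a, b): c / out[a] for (a, b), c in count.items()}

def prune(w):
    """Keep, for every node, only the outgoing edge of maximum weight (G*)."""
    best = {}
    for (a, b), weight in w.items():
        if a not in best or weight > best[a][1]:
            best[a] = (b, weight)
    return {a: b for a, (b, _) in best.items()}

w = weighted_transition_graph({5: [5, 8, 16, 8], 15: [15, 8, 16, 16]})
print(prune(w))   # most likely successor of every observed state
```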


Figure 3.10. (a) Global graph of M state transitions. Edge weights are visualized by colours: from the lowest weight in orange to the highest weight in blue. (b) Pruned global graph of M: only the transition with maximum weight for any given predecessor is shown; each node/state has at most one outgoing edge.

Figure 3.11. Graph of g at t = 41.

representing the function g: {0, 1}⁶ → {0, 1}⁶. As we can see in Table 3.3, the transitions at t = 41 and t = 53 correspond to the highest numbers of different binary strings (e1, . . . , e6). The graph corresponding to g(41) at t = 41 is shown in Figure 3.11 and is not connected. The small component consists of the fixed point 40 (string "101000") with two leaves, 39 ("100111") and 38 ("100110"). The largest component has a tree structure at large, with the cycle 2 ("000010") – 1 ("000001") as a root. Other nodes with most predecessors are 8 ("001000"), 16 ("010000") and 18 ("010010"). From the transitions g(41), we can reconstruct the Boolean functions realized at each of the six electrodes (the functions are minimized and represented in disjunctive normal form):

e0: f0(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5

e1: f1(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5
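Minimized DNFs of this kind can be recovered mechanically from a transition table, for example with SymPy's SOPform; in the sketch below, the transition pairs are hypothetical stand-ins, not the measured g(41):

```python
from sympy import symbols
from sympy.logic import SOPform

x = symbols("x0:6")                      # Boolean inputs x0 ... x5
# Hypothetical fragment of a transition table: (input state, output state),
# each a 6-bit string encoded as an integer.
transitions = {0b000101: 0b001000, 0b001111: 0b010000, 0b011111: 0b000001}

def f_i(i):
    # Minterms: input bit-vectors for which output electrode e_i spikes.
    minterms = [[(a >> (5 - k)) & 1 for k in range(6)]
                for a, b in transitions.items() if (b >> (5 - i)) & 1]
    return SOPform(x, minterms)

print(f_i(2))                            # minimized DNF for electrode e2
```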

Table 3.3. Fifty-four state transitions of M over all possible inputs: t is a transition step and μ(t) is the number of different states appearing over all possible inputs.

t     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
μ(t)  3  3  3  3  3  3  4  4  5  4  5  4  6  8  8  6  5  7

t    19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
μ(t)  6  8  9  6  7  6  7  6  6  6  7  7  9  7  9  7  6  6

t    37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
μ(t)  7  7  7  7 10  8  9  9  7  7  5  5  8  6  8  9 10  9

10 20 Nodes

Number of different nodes (t)

b4205-v2-ch03

5

10

0 10

20

30 Transition step t

40

0

50

0

10

20

30 Inputs

(a)

40

50

60

(b)

Inputs

60

40

20

0 0

10

20

30 Nodes

40

50

60

(c)

Figure 3.12. Distributions characterizing richness of M’s responses. (a) Different states per transitions over all inputs. The horizontal axis shows steps of M transitions. Vertical axis is a number of different states. (b) Nodes per input. Horizontal axis shows decimal values of input strings. The horizontal axis shows a number of different states/nodes generated in the evolution of M. (c) Inputs per node.

page 122

August 3, 2021

17:51

Handbook of Unconventional Computing (in 2 Vols.) - 9in x 6in

Discovering Boolean Functions on Actin Networks

b4205-v2-ch03

123

e2: f2(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5

e3: f3(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5

e4: f4(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5

e5: f5(x0, . . . , x5) = x0·x1·x2·x3 + x0·x1·x3·x4·x5 + x0·x1·x2·x3·x4 + x0·x2·x3·x4·x5 + x0·x1·x2·x3·x4 + x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5 + x0·x1·x2·x3·x4·x5

3.4. Voltage-based Gates

3.4.1. The model

A detailed description of the key models that we used as a foundation in the study by Siccardi et al.43 can be found in Ref. [19], which aims at a description of actin filaments and contains the derivation of all the formulas. Let us highlight the assumptions on which the model was built. Each monomer in the filament has 11 negative excess charges. The double-helical structure of the filament provides regions of uneven charge distribution, such that pockets of higher and lower charge density exist. There is a well-defined distance, the so-called Bjerrum length λ_B, beyond which thermal fluctuations are stronger than the electrostatic attraction or repulsion between charges in solution. It is inversely proportional to temperature and directly proportional to the ions' valence z:

λ_B = z e² / (4π ε ε₀ k_B T),    (3.1)

where e is the electrical charge, ε₀ the permittivity of the vacuum, ε the dielectric constant of the solution the actin filaments are immersed in (estimated to be similar to that of water, ε ≈ 80), k_B the Boltzmann constant and T the absolute temperature. If δ is the mean distance between charges, counterion condensation is expected when

λ_B/δ > 1. Considering that the temperature is T = 293 K and the ions are monovalent, Ref. [19] finds λ_B = 7.13 × 10⁻¹⁰ m, and Ref. [44] finds λ_B = 13.8 × 10⁻¹⁰ m for Ca²⁺ at T = 310 K. For actin filaments, δ is estimated to be 0.25 nm because, assuming an average of 370 monomers per μm, there are ca. 4e/nm.

Each monomer behaves like an electrical circuit with inductive, capacitive and resistive components. The model is based on the transmission line analogy. The capacitance C is computed considering the charges contained in the space between two concentric cylinders, the inner with a radius of half the width of a monomer (r_actin = 2.5 nm) and the outer with radius r_actin + λ_B; both cylinders are one monomer high (5.4 nm). Thus,

C₀ = 2π ε ε₀ l / ln((r_actin + λ_B)/r_actin),    (3.2)

where l ≈ 5.4 nm is the length of a monomer. The charge on this capacitor is assumed to vary in a nonlinear way with voltage, according to the formula

Q_n = C₀ (V_n − b V_n²).    (3.3)

The nonlinear voltage dependence of the electrochemical capacitance of nanoscale conductors is caused by the finite density of states of the conductors; details can be found in Refs. [45, 46]. We did not try to evaluate this parameter; instead, we used some trial values in our equations and found that, as long as b is reasonably small, the solutions converge to the constant ones in the cases that we considered. So we focused on constant solutions, and we can conclude that nonlinearity is not needed for our results. The inductance L is computed as

L = μ N² π (r_actin + λ_B)² / l,    (3.4)

where μ is the magnetic permeability of water and N is the number of turns of the coil, that is, the number of windings of the distribution of ions around the filament. It is approximated by counting how many

ions can be lined up along the length of a monomer, as N = l/r_h, where it is assumed that the size of a typical ion is r_h ≈ 3.6 × 10⁻¹⁰ m. The resistance R is estimated considering the current between the two concentric cylinders, obtaining

R = ρ ln((r_actin + λ_B)/r_actin) / (2π l),    (3.5)

where the resistivity ρ is approximately given by

ρ = 1 / (Λ₀^{K⁺} c_{K⁺} + Λ₀^{Na⁺} c_{Na⁺}).    (3.6)

Here, c_{K⁺} and c_{Na⁺} are the concentrations of potassium and sodium ions, which were considered in previous papers to be 0.15 M and 0.02 M, respectively; Λ₀^{K⁺} ≈ 7.4 (Ω m)⁻¹ M⁻¹ and Λ₀^{Na⁺} ≈ 5.0 (Ω m)⁻¹ M⁻¹ are positive constants that depend only on the type of salt but not on the concentration.19 With this formula R₁ is computed, and R₂ is taken as ≈ (1/7) R₁. Here, R₁ accounts for viscosity. Figure 3.13 illustrates the circuit schema, where an actin monomer unit in a filament is delimited by the dotted lines.

Figure 3.13. A circuit diagram for the nth unit of an actin filament. Taken from Ref. [19].

The main equation for filaments is the following, derived from Ref. [19] (see Figure 3.13 for the meaning of R₁ etc.):

L C₀ (d²/dt²)(V_n − b V_n²) = V_{n+1} + V_{n−1} − 2 V_n − R₁ C₀ (d/dt)(V_n − b V_n²) − R₂ C₀ [ 2 (d/dt)(V_n − b V_n²) − (d/dt)(V_{n+1} − b V_{n+1}²) − (d/dt)(V_{n−1} − b V_{n−1}²) ].    (3.7)

In Ref. [32], we used this equation to compute the evolution of some tens of monomers in a filament. It must be observed that the Bjerrum length will probably not be constant, but may vary both from point to point and with time. Also, one could consider the effects described in Ref. [47], leading to charge density waves. However, the effects reported in that work refer mainly to electrostatically condensed bundles, while the bundles and their networks in our experimental setting were formed via depletion forces. The depletion forces are a fundamental, entropic effect and do not rely on counterion condensation.48,49 Thus the situations, that is, the charge distributions, are completely different. Moreover, the present work is a computational analysis intended to prepare real experiments and to speculate about potential solutions, and it uses the simplest possible stimuli, that is, constant ones. Therefore, we did not consider phenomena that, even if they happened, might be considered transient in this context.

3.4.2. Extension to bundle networks

In order to extend the model to bundle networks, we must compute the suitable electrical parameters. The actin filaments are made of elements, the actin monomers. We model bundles as made of elements of the same height as a single monomer, with a width depending on the bundle density. We consider two possibilities:

(1) The filament density in the bundle is so low that each filament stands at a distance greater than twice λ_B from all the others.

In this situation, we assume that filaments do not interact and that each one behaves as if it were not in the bundle.

(2) The inner-bundle density is high enough that the areas closer than λ_B to the filaments intersect. In this situation, we conservatively assume that the influences of the filaments' ions cancel out.

In case 1, we can either consider the parameters for a single filament and simply multiply the results by the number of filaments in the bundle, or compute C, L and R using the standard formulas for electrical parallel circuits. In case 2, we simply use the bundle radius instead of the filament radius in the above formulas. Considering a Bjerrum length λ_B = 7.13 × 10⁻¹⁰ m19 leads to the results for high density bundles at different bundle widths displayed in Table 3.4. Results for low density bundles made of varying filament numbers are shown in Table 3.5. In the following, we define equations for nodes. Equation (3.7) applies to elements inside the bundle, so we use Equation (3.8) instead, where n is the index of the element, M is the number of elements linked to it, and the suffix n_k ranges over the set of such linked elements.

Table 3.4. C₀, L and R₁ for high density bundles.

Width       200 nm        450 nm       700 nm
C₀ in pF    33.8 · 10⁻⁴   76 · 10⁻⁴    11.8 · 10⁻³
L in pH     1668          8378         20227
R₁ in MΩ    0.173         0.077        0.049

Table 3.5. C₀, L and R₁ for low density bundles.

Filaments   1              25            50            75
C₀ in pF    102.6 · 10⁻⁶   4.1 · 10⁻⁶    2 · 10⁻⁶      1.4 · 10⁻⁶
L in pH     1.92           7.66 · 10⁻²   3.83 · 10⁻²   2.56 · 10⁻²
R₁ in MΩ    5.7            0.23          0.11          0.08
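The tabulated values can be re-derived from Equations (3.1)–(3.6). The following sketch is our own numerical check, not the authors' code; small deviations from the tables are to be expected from rounded constants (the 200 nm column of Table 3.4 is reproduced closely, the single-filament column of Table 3.5 to within roughly 10%):

```python
from math import pi, log
from scipy.constants import e, epsilon_0, k, mu_0

eps, T = 80.0, 293.0                 # dielectric constant of water; temperature, K
l, r_h = 5.4e-9, 3.6e-10             # monomer length; typical ion size, m
lam_B = e**2 / (4 * pi * eps * epsilon_0 * k * T)        # Eq. (3.1), z = 1

def bundle_parameters(r):
    C0 = 2 * pi * eps * epsilon_0 * l / log((r + lam_B) / r)   # Eq. (3.2)
    N = l / r_h                                                # ion "coil" turns
    L = mu_0 * N**2 * pi * (r + lam_B)**2 / l                  # Eq. (3.4), mu of water ~ mu_0
    rho = 1.0 / (7.4 * 0.15 + 5.0 * 0.02)                      # Eq. (3.6), Ohm*m
    R1 = rho * log((r + lam_B) / r) / (2 * pi * l)             # Eq. (3.5)
    return C0, L, R1

print(f"lambda_B = {lam_B:.3g} m")       # ~7.1e-10 m, matching the quoted value
print(bundle_parameters(2.5e-9))         # single filament, cf. Table 3.5
print(bundle_parameters(100e-9))         # 200 nm wide bundle, cf. Table 3.4
```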

The term F_n represents an input voltage, which is supposed to be non-zero only for some values of n:

L C₀ (d²/dt²)(V_n − b V_n²) = Σ_{k=1}^{M} V_{n_k} − M V_n + F_n − R₁ C₀ (d/dt)(V_n − b V_n²) − R₂ C₀ [ M (d/dt)(V_n − b V_n²) − Σ_{k=1}^{M} (d/dt)(V_{n_k} − b V_{n_k}²) ].    (3.8)
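As an illustration of how Equation (3.8) can be integrated numerically with SciPy, here is a minimal sketch with b = 0, a hypothetical three-element chain, and the single-filament parameters of Table 3.5; a constant input voltage is applied to the first element:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical 3-element chain (0-1-2); single-filament values from Table 3.5.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
M = A.sum(axis=1)                          # number of linked elements per node
L, C0, R1 = 1.92e-12, 102.6e-18, 5.7e6
R2 = R1 / 7
F = np.array([1e-3, 0., 0.])               # constant input voltage on element 0

def rhs(t, y):                             # Eq. (3.8) with b = 0
    V, W = y[:3], y[3:]                    # W = dV/dt
    dW = (A @ V - M * V + F - R1 * C0 * W - R2 * C0 * (M * W - A @ W)) / (L * C0)
    return np.concatenate([W, dW])

sol = solve_ivp(rhs, (0.0, 1e-8), np.zeros(6), method="BDF")  # stiff system
print(sol.y[:3, -1])                       # element voltages at t = 10 ns
```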

These equations can represent any type of element in the network. When M = 2, they coincide with (3.7) and represent internal elements of a bundle. When M = 1, they refer to a free terminal element of a bundle that is not connected to anything else. We note that in Ref. [32] we used a slightly different equation for this case, namely we always kept M = 2. The present form is more consistent with the model and its generalization. Other values of M represent generic nodes.

3.4.3. The network

We used a stack of low-dimensional images of the 3D actin network, produced in experiments on the formation of regularly spaced bundle networks from homogeneous filament solutions.41 The network was chosen because it resulted from a protocol that reliably produces regularly spaced networks due to self-assembly effects40,41 and thus could be used in the prototyping of cytoskeleton computers. From the stack of images we extracted a network description, in terms of edges and nodes, and used it as a substrate on which to compute the electrical behavior. The extracted structure takes into account the main bundles in each image, with their intersections, and an estimate of the bundles that can connect nodes in two adjacent images. It is not


an accurate portrait of all the bundles, but it captures the main characteristics of the network. The main steps to compute the network structure were as follows:

(1) After some preprocessing of the images (e.g., thresholding, contour finding, distance transform), we looked at the points placed at local maxima of the distance from the background and considered these to be the nodes of the network (a minimal sketch of this step is given after the list). Each node found in this way has a center and a radius (corresponding to the circle that can be inscribed in the foreground).

(2) We then tried to link nodes to each other with straight lines or elliptical arcs, checking that they do not leave the bundles (with some tolerance, as the bundles are often bent). For this, starting from node 1, we considered the point spaced about 16 pixels along the line from node 1 to, for example, node 2. If its color was above the threshold, we went on to the next point 16 pixels farther. If not, we considered the points in a neighborhood 4 pixels wide: if at least one was above the threshold, we considered that the edge was still in the bundle and went on; if not, we stopped.

(3) If we could not find any straight line, we tried some elliptical arcs, with the major axis equal to the distance between the nodes and a range of minor axes, using the same procedure.

(4) When we were able to reach, e.g., node 2 from node 1, we added the edge to the network, with its length (the distance between the linked nodes for straight edges, or the approximate ellipse arc length for the others) and a width equal to the average of the radii of its nodes; when we were not able to reach node 2, we did not add the edge; if a node was not connected to anything, we did not consider it any further.

(5) We also tried to detect edges between images. For this, we merged the bundles of two consecutive images, shrunk them a bit, and applied the method of step 2 to link nodes. We used a lower tolerance in this step and considered straight lines only.
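Step 1 can be sketched as follows with scipy.ndimage (Scipy and Matplotlib are the libraries used throughout this chapter); the image array img, the intensity threshold thr and the neighborhood size are placeholders, not the exact values used in our processing:

```python
import numpy as np
from scipy import ndimage

def find_nodes(img, thr, size=5):
    """Candidate network nodes of a 2D slice: local maxima of the
    distance-from-background map.  Returns (row, col, radius) tuples,
    the radius being the inscribed-circle radius at that point."""
    fg = img > thr                              # foreground (bundle) mask
    dist = ndimage.distance_transform_edt(fg)   # distance to the background
    peaks = (dist == ndimage.maximum_filter(dist, size=size)) & fg
    rows, cols = np.nonzero(peaks)
    return [(r, c, dist[r, c]) for r, c in zip(rows, cols)]
```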


All computations described here and in the other sections were performed using Python, including its libraries Matplotlib50 and Scipy.51 In the end, we obtained, for each picture, a table of nodes giving, for each node, its radius, its position and the list of nodes reachable from it, and another table with the edges and their characteristics. Based on this, we derived the data in Table 3.6. These figures can be compared to typical characteristics found in experiments.41 The bundles are formed by depletion forces and neighboring filaments maximize their overlap region; the bundles will be at least as long as the longest filaments, and experimentally it is even hard to form super long bundles. A typical length distribution of actin filaments has a mean value of about 9–10 μm, which is already rather long, and a range between 10 and 50 μm for bundles. In our model, the length distribution is quite skewed, with 80% of edge lengths in the range 10–30 μm and another 7% in the range 30–50 μm. Regarding bundle sizes, 84% of the radii in the model are in the range of 1.22–3.42 μm. This can be considered a reasonable estimate and compares well to the experiments.41 The average number of filaments in a bundle is quite difficult to measure experimentally. An estimate based on comparing the fluorescence intensity of a bundle against a single filament yielded a result of 45 (±25) filaments per bundle.52 Only qualitative differences can be shown when successively increasing the number of filaments within one bundle.53 Considering even small bundles of 100 nm radius, we find that they can accommodate more than 100 filaments, assuming a filament radius of 5.4 nm plus 2λB. For this reason, the low density model for bundles is probably appropriate, and we decided not to consider possible interactions between actin filaments within the bundles. As an illustration, the white lines drawn on the original image in Figure 3.14 represent the computed edges. In a realistic experiment, one would have to set a support with electrodes in contact with the network. In a typical configuration, we


Figure 3.14. Two-dimensional centers and edges of a Z-slice.

will consider a grid of 5 × 6 electrodes, for instance, on a thin glass; their diameter is 10 μm and the center-to-center distances are 30 μm. We considered two situations: (1) the network is grown in droplets sitting on the glass surface holding the electrodes, which is very close to the experimental setup described in Refs. [41, 54]; and (2) the array of electrodes is set inside the network, that is, in the middle of the actin droplet along its vertical axis, which might be the case if the network were grown around the electrode layer or if the layer were placed in the network later. Figure 3.15 is an illustration of the network grown on top of the electrode-containing glass. The general features of the network are listed in Table 3.6.


Figure 3.15. The grid of electrodes as it appears when the network is formed on top of the supporting glass.

3.4.4. Preliminary results

We used the simplest possible form for the input functions Fn, namely constants:

$$
F_n \equiv
\begin{cases}
1 & \text{if } n \in N_1,\\
0 & \text{if } n \in N_0,\\
-1 & \text{if } n \in N_{-1},
\end{cases}
\qquad 0 < t < t_1, \tag{3.9}
$$

where N1, N0 and N−1 are three sets of indices and t1 is the duration of the input stimuli, which can be equal to or less than the whole experiment time.


Table 3.6. Parameters of the actin network used in the modelling.

Parameter                                              Value
Number of nodes in the main connected graph            2968
Number of edges in the main connected graph            7583
Max number of nodes linked to a node                   13
Average number of nodes linked to a node               5.07
Standard deviation of nodes linked to a node           2.14
Average radius of edges in pixels                      8.48
Max radius of edge in pixels                           20
Min radius of edge in pixels                           3
Standard deviation of radii of edges in pixels         2.62
Average edge radius (one pixel is 244.14 nm)           2.07 μm
Average length of edges in pixels                      70.11
Max length of edge in pixels                           465.40
Min length of edge in pixels                           4.12
Standard deviation of lengths of edges in pixels       41.54
Average edge length (one pixel is 244.14 nm)           17.12 μm

Numerical integration has been performed for bundles consisting of some tens of elements, using various stimuli and electrical values (a minimal integration sketch is given below). We considered both open bundles with free extremal elements and closed ones, where every element is connected to two others. Examples can be found in Figure 3.16 (high density open bundle) and Figure 3.17 (low density closed bundle). These numerical experiments demonstrate that, in all the cases considered, the solutions become constant after a transient time. Moreover, when the inputs are removed, all the solutions converge to the same constant value, so that no currents can be detected. We therefore considered constant stimuli lasting for the whole experiment time and searched for constant solutions. Input bits are defined as a pair of points of the network: a +1 potential (in arbitrary units) is applied at one of them and −1 at the other to encode a bit value of 1; when no potential is applied, the bit value is zero.
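A minimal sketch of such an integration for an open bundle, written in terms of un = Vn − bVn2 so that Equation (3.8) becomes a first-order system; all parameter values below are illustrative (arbitrary units), not the fitted filament values:

```python
import numpy as np
from scipy.integrate import solve_ivp

N = 32                                   # elements in an open chain
LC0, R1C0, R2C0, b = 1.0, 0.5, 0.5, 0.1  # illustrative values, arbitrary units
F = np.zeros(N)
F[0], F[-1] = 1.0, -1.0                  # constant inputs, Eq. (3.9)

def v_of_u(u):
    """Invert u = V - b*V**2 on the small-V branch."""
    return (1.0 - np.sqrt(np.maximum(1.0 - 4.0 * b * u, 0.0))) / (2.0 * b)

def rhs(t, y):
    u, du = y[:N], y[N:]
    V = v_of_u(u)
    M = np.full(N, 2.0)
    M[0] = M[-1] = 1.0                   # free terminal elements have M = 1
    Vsum, dusum = np.zeros(N), np.zeros(N)
    Vsum[:-1] += V[1:];  Vsum[1:] += V[:-1]
    dusum[:-1] += du[1:]; dusum[1:] += du[:-1]
    d2u = (Vsum - M * V + F - R1C0 * du - R2C0 * (M * du - dusum)) / LC0
    return np.concatenate([du, d2u])

sol = solve_ivp(rhs, (0.0, 200.0), np.zeros(2 * N), method="LSODA")
print(v_of_u(sol.y[:N, -1]))             # potentials after the transient
```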


Figure 3.16. Evolution of the potential of the first, middle and last elements of an open high density (HD) bundle 32 elements long and 450 nm thick. The input was set at 1 and −1 at the first and last elements.

Figure 3.17. Evolution of the potential of some elements of a closed low density (LD) bundle 32 elements long, made of 50 filaments. The input was set at 1 and −1 at the elements with indices 0 and 15.


Analogously, we chose pairs of points and measured their potential difference to read an output bit. A suitable threshold was defined to distinguish the 1 and 0 values.

3.4.5. Results

3.4.5.1. Ideal electrodes

In this section, we consider ideal electrodes that (1) can be placed at any point on the network surface or inside it, and (2) are so small that they would be in contact with only one element of a single bundle. Moreover, we use a slightly idealized network of spherical shape.

3.4.5.2. Boolean gates

We randomly chose eight sites in the network and considered them as four pairs representing four input bits. Then we applied in turn all the possible input states from (0000) to (1111) and solved the system (3.8), which for constant solutions reduces to a linear algebraic system, simplifying the computation of the node potentials. Then we checked, for all the sets of input states that correspond to a logical input, which output bits correspond to the expected results for a gate. For instance, to find the not gates we considered the input state sets ((0000),(0001)), ((1000),(1001)), etc.; then we looked for all the output bits that are 1 for the first state and 0 for the second state of one of the input sets (a sketch of this search is given below). The same procedure was used to find or, and and xor gates. We used three values for the output threshold: 2, 1, and 0.5. The results of three runs are shown in Tables 3.7–3.10, revealing that once an input position is chosen, it is possible to find a suitable number of edges that behave as outputs for the main gate types.
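A minimal sketch of the search for not gates on the last input bit; the dictionary potentials, mapping each 4-bit input state such as "0101" to the vector of node potentials obtained from system (3.8), and the list pairs of candidate output point pairs are placeholders for the solver's output:

```python
import itertools

def read_bit(potentials, state, pair, thr):
    """Output bit: 1 if the potential difference of the pair exceeds thr."""
    i, j = pair
    return int(abs(potentials[state][i] - potentials[state][j]) > thr)

def find_not_gates(potentials, pairs, thr):
    """Pairs that read 1 for some state (xxx0) and 0 for its partner (xxx1)."""
    prefixes = ["".join(p) for p in itertools.product("01", repeat=3)]
    return [pair for pair in pairs
            if any(read_bit(potentials, pre + "0", pair, thr) == 1 and
                   read_bit(potentials, pre + "1", pair, thr) == 0
                   for pre in prefixes)]
```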


Table 3.7. Number of possible not gates.

Run    Thresh. 2    Thresh. 1    Thresh. 0.5
1      8266         8944         8409
2      3688         4660         4682
3      5730         7043         7455

Table 3.8. Number of possible or gates.

Run    Thresh. 2    Thresh. 1    Thresh. 0.5
1      4385         8191         12494
2      6360         8188         11336
3      5835         8260         11063

Table 3.9. Number of possible and gates.

Run    Thresh. 2    Thresh. 1    Thresh. 0.5
1      3600         3562         2577
2      4506         43842        3119
3      4954         5076         3726

Table 3.10. Number of possible xor gates.

Run    Thresh. 2    Thresh. 1    Thresh. 0.5
1      1543         2155         3749
2      584          986          1799
3      1009         1499         3083

Time estimates: We also computed the time the network would need to converge to the constant solutions, taking into account the time needed for an element to discharge. As a first estimate, we used the value R1C0, that is, the discharge time of a pure RC circuit. Using the parameters for a single filament (or for low density bundles made of independent filaments), we got 2.248 · 10−3 s for a signal to travel the 3,843,876 elements of the whole network. Parameters for a high density network, adjusted for the estimated width of each bundle, gave a time of 2.25 · 10−3 s. In both cases, the velocity is of the order of 4.7 m/s, two orders of magnitude larger than the estimate found in Ref. [44] with a different model (pure RC), but in the range estimated in Ref. [19] using the presented method.
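A quick arithmetic check of these estimates, using the single-filament values of Table 3.5; the element height of about 2.7 nm (roughly the rise per actin monomer) is our assumption, introduced here only to recover the quoted velocity:

```python
R1 = 5.7e6                 # Ohm (5.7 MOhm, Table 3.5, one filament)
C0 = 102.6e-6 * 1e-12      # F   (102.6e-6 pF)
n_elements = 3_843_876
t_total = R1 * C0 * n_elements      # one RC discharge time per element
v = n_elements * 2.7e-9 / t_total   # assumed element height: 2.7 nm
print(f"t = {t_total:.3e} s, v = {v:.1f} m/s")   # ~2.25e-3 s, ~4.6 m/s
```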


3.4.5.3. Realistic electrodes

In this section we consider electrodes that are actually available, together with their supporting glass. Moreover, we use the real network dimensions (the confocal images are 250 μm × 250 μm and they are spaced 110 μm in depth). We considered both the case of the network sitting on top of the glass holding the electrodes, and the case of the electrodes placed inside the network along the middle plane of the confocal image stack.

Boolean gates: In the case of the network on top of the glass, we randomly chose eight electrodes and considered them as four pairs representing four input bits. We applied in turn all the possible input states from (0000) to (1111) and solved the system (3.8). Then we computed the potential differences for all the pairs of electrodes that were not used as input, and applied a suitable threshold, the median of the differences, to distinguish 0 and 1 bits. As 10 of the 18 electrodes connected to the network were not used as input, we had 45 potential output bits. We found that, considering all the possible input and output bits, we have 101 not gates, 113 or gates, 46 and gates and 13 xor gates. It must be noted that the same pair of output electrodes may have been counted many times in these numbers. For instance, the potential difference between electrodes 46 and 32 (the electrodes in row 4, column 6 and in row 3, column 2) was considered a possible not gate for all the cases listed in Table 3.11. In the case of the electrodes placed in the interior of the network, we randomly chose 12 electrodes and considered them as six pairs representing six input bits. We applied in turn all the possible input states from (000000) to (111111) and solved the system (3.8). Then we computed the potential differences for all the pairs of electrodes that were not used as input, and again applied the median of the differences as the threshold to distinguish 0 and 1 bits.


Table 3.11. Possible not with a single edge.

Input states    Output values    not on bit
1100, 1101      1, 0             4
1010, 1011      1, 0             4
1000, 1001      1, 0             4
0110, 0111      1, 0             4
0100, 0101      1, 0             4
0111, 0011      0, 1             2
0101, 0001      0, 1             2
1011, 0011      0, 1             1
1001, 0001      0, 1             1

As 15 of the 27 electrodes connected to the network were not used as input, we had 105 potential output bits. We found that, considering all the possible input and output bits, we have 1885 not gates, 1279 or gates, 783 and gates and 467 xor gates.

3.4.6. Finite state machine

The actin network implements a mapping from {0, 1}k to {0, 1}k, where k is the number of input bits represented by potential differences in pairs of electrodes, as described above. Thus, the network can be considered as an automaton, or finite state machine, Ak = ⟨{0, 1}, C, k, f⟩. The behavior of the automaton is governed by the function f : {0, 1}k → {0, 1}k, k ∈ Z+. The structure of the mapping f is determined by the exact configuration of electrodes C in R3 and the geometry of the actin filament bundle network.


3.4.6.1. Using two values of k = 4 and k = 6

The machine A4 represents the actin network placed onto an array of electrodes. In this case, having 45 potential output bits at our disposal, the number of combinations of 4 of them is 148,995. We therefore limited the study to the output positions that assume the value 1 more than 6 and fewer than 11 times over the 16 input states. In this way we found 11 output bits and computed the state transitions for the 330 machines that one can obtain by choosing 4 of them, k = 4. The machine A6 represents the actin network with the array of electrodes inside the network. In this case, having 105 potential output bits at our disposal, the number of combinations of 6 of them is quite large. We therefore limited the study to the output positions that assume the value 1 exactly 32 times over the 64 input states. In this way we again found 11 output bits and computed the state transitions for the 462 machines that one can obtain by choosing 6 of them, k = 6.

We derived the structures of the functions f4 and f6, governing the behavior of the automata A4 and A6, as follows. There is potentially an infinite number of electrode configurations in R3. Therefore, we selected 330 and 462 configurations C for machines A4 and A6, respectively, and calculated the frequencies of connections of input to output states, obtaining two probabilistic state machines ⟨{0, 1}, p, k, f⟩, where p : {0, 1}k × {0, 1}k → [0, 1] assigns a probability to each mapping from {0, 1}k to {0, 1}k. Thus, the state transitions of Ak form a directed weighted graph, where a weight represents the probability of the transition between the states of Ak corresponding to the nodes of the graph. The weighted graph can be converted to a non-weighted directed graph by removing all edges with weight less than a given threshold θ; in the following, we perform this trimming for several thresholds in increments of 0.1 (a sketch of this trimming, together with the most-likely-successor reduction introduced below, follows this paragraph). The graph remains connected for θ up to 0.1 (Figure 3.18). The graph for A4 is characterized by having no unreachable nodes and several absorbing states (Figure 3.18(a)), while the graph for A6 has a number of unreachable nodes (Garden-of-Eden states) and fewer absorbing states than A4 (Figure 3.18(b)).
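Both reductions are straightforward to express in code. A minimal sketch, assuming the weighted transition graph is stored as a dict of dicts {state: {successor: probability}} (the representation is ours):

```python
def trim(graph, theta):
    """Drop all transitions whose probability is below theta."""
    return {s: {t: p for t, p in succ.items() if p >= theta}
            for s, succ in graph.items()}

def most_likely(graph):
    """Keep, for each state, only its highest-probability successor."""
    return {s: max(succ, key=succ.get) for s, succ in graph.items() if succ}

# Toy 2-bit example, trimmed at theta = 0.2.
g = {"00": {"00": 0.7, "01": 0.3}, "01": {"11": 0.9, "00": 0.1},
     "10": {"11": 0.5, "00": 0.5}, "11": {"11": 1.0}}
print(trim(g, 0.2))
print(most_likely(g))
```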


Figure 3.18. State transition graphs for (a) A4 and (b) A6; the trimming threshold is θ = 0.1. Nodes are labelled by the decimal representation of the 4-bit (a) and 6-bit (b) states.

The state transition graph of A6 becomes disconnected for θ = 0.2 (Figure 3.19(b)), while the graph of A4 remains connected (Figure 3.19(a)). Another way of converting weighted, probabilistic state transition graphs into non-weighted graphs is by selecting, for each node x, a successor y such that the weight of the arc (xy) is the highest among all arcs outgoing from x. These graphs G4 and G6 of most likely transitions are shown in Figure 3.20. The graph G4 (Figure 3.20(a)) has two disconnected sub-graphs, 8 Garden-of-Eden states and two absorbing states, corresponding to (1111) and (0000); the graph has no cycles. The graph G6 has five disconnected subgraphs (Figure 3.20(b)). Two of them have only absorbing states,


Figure 3.19. State transition graphs for (a) A4 and (b) A6; the trimming threshold is θ = 0.2. Nodes are labeled by the decimal representation of the 4-bit (a) and 6-bit (b) states, respectively.

corresponding to (000000) and (101010), and no cycles. Three of the sub-graphs do not have an absorbing state but have cycles: (111110) → (111111) → (111110), (001111) → (100110) → (111100) → (001111) and (000011) → (110001) → (001001) → (000010) → (100001) → (001100) → (000011).

3.5. Discussion

Early concepts of sub-cellular computing on cytoskeleton networks, such as microtubule automata8–10 and information processing in actin-tubulin networks,11 did not specify what type of “computation” or “information processing” the cytoskeleton networks could execute, or how exactly they would do it. We presented several concrete implementations of logical gates and functions on a single actin filament32 and at the intersection of several actin filaments33 via


Figure 3.20. Graphs of the most likely transitions: G4 of A4 (a) and G6 of A6 (b).

collisions between solitons. We also used a reservoir-computing-like approach to discover functions on a single actin unit34 and filament.35 Later, we realized that it might be unrealistic to expect someone to initiate and record travelling localizations (solitons, impulses) on a single actin filament. Therefore, we developed a numerical model of spikes propagating on a network of actin filament bundles and demonstrated that such a network can implement Boolean gates.36 In the present chapter, we reconsidered the whole idea of information processing on actin networks and designed an actin droplet machine. The machine is a model of a 3D network, based on an experimental network developed in a droplet, which executes a mapping F of the space of binary strings of length k onto itself. The machine acts as a finite state machine whose behavior at a low level is governed by localizations travelling along the networks and


interacting with each other. By focusing on a single element of a string, that is, a single electrode location, we can reconstruct k functions with k arguments, as we exemplified at the end of Section 3.4.6. The exact structure of each k-ary function is determined by F, which, in turn, is determined by the exact architecture of the 3D actin network and the configuration of electrodes. Thus, potential future directions could be a detailed analysis of the possible architectures of actin networks developed in laboratory experiments, and an evaluation of how far the exact configuration of electrodes affects the structure of the mapping F and the corresponding distribution of functions implementable by the actin droplet machine. The ultimate goal would be to implement actin droplet machines in laboratory experiments and to cascade several machines into a multiprocessor computing architecture. Conventional hardware is static. Actin networks reconfigure dynamically: some filaments disappear by depolymerization, new filaments appear by polymerization. This is not a disadvantage of actin network computers because (1) they operate at speeds several orders of magnitude above the actin treadmilling rate, (2) actin networks can be stabilized, and (3) we can employ the dynamic reconfigurability in the computation. A computation in actin bundle networks is implemented with travelling mechanical or electrical signals. We estimate the signal propagation speed to be 106 μm/s for sound solitons, or 105–108 μm/s for action-potential-like signals. Let us take the lowest estimate, 105 μm/s. Assuming a maximum linear size of an actin droplet machine of ca. 250 μm, the machine can process about 400 parallel inputs per second, thus operating at a 0.4 kHz frequency. Commonly, the actin polymerization speed is estimated to be 4 · 10−1 μm/s. An actin bundle has up to 500 actin filaments, which will not fail simultaneously. In fact, we have seen that the networks, once formed, could remain stable over hours or even days without major rearrangements. In contrast to cells, no actin accessory proteins were used and no energy in the form of ATP was provided. The structures self-assembled, driven solely by thermodynamics, into a stable, frozen state. If we neglected these experimental


findings and assumed an active network with high treadmilling rates, we could still consider the network fixed for at least 10 s, which allows us to execute up to 4 · 103 cycles of computation. The lifetime of the fixed network can even be substantially extended by using accessory proteins such as purely synthetic actin crosslinkers made from DNA and peptides,29 or natural crosslinkers such as α-actinin, by increasing the ratio of integrin and drebrin peptides in the matrix solution, by hardening the filaments with phalloidin57 and by stabilizing the filaments with synthetic mini-nebulin. Using accessory proteins such as gelsolin, cofilin, formin and myosins would even allow potential reconfiguration effects to be sped up, enabling a dynamic computing system to be built.15 The dynamical reconfiguration of actin network computers can be used as an advantage for accelerating Boolean satisfiability solvers, for reconfigurable data flow machines implementing atomic functional programming languages, for dynamical genetic programming on evolvable Boolean networks, and for cryptographic applications.

Acknowledgments

AA was supported by the EPSRC with grant EP/P016677/1 “Computing with Liquid Marbles”.

References

1. A. Atrubin, A one-dimensional real-time iterative multiplier. IEEE Trans. Electr. Comput. 3, 394–399 (1965).
2. P. C. Fischer, Generation of primes by a one-dimensional real-time iterative array. J. ACM 12(3), 388–394 (1965).
3. A. Waksman, An optimum solution to the firing squad synchronization problem. Inform. Control 9(1), 66–78 (1966).
4. J. K. Park, K. Steiglitz, and W. P. Thurston, Soliton-like behavior in automata. Physica D: Nonlinear Phenomena 19(3), 423–432 (1986).
5. R. K. Squier and K. Steiglitz, Programmable parallel arithmetic in cellular automata using a particle model. Compl. Syst. 8(5), 311–324 (1994).
6. A. Adamatzky, Collision-based computing in biopolymers and their automata models. Int. J. Modern Phys. C 11, 1321–1346 (2000).
7. A. Adamatzky (ed.), Collision-based Computing (Springer, 2002).


8. S. R. Hameroff and S. Rasmussen, Information processing in microtubules: Biomolecular automata and nanocomputers. In Molecular Electronics (Springer, 1989), pp. 243–257.
9. S. Rasmussen, H. Karampurwala, R. Vaidyanath, K. S. Jensen, and S. Hameroff, Computational connectionism within neurons: A model of cytoskeletal automata subserving neural networks. Physica D: Nonlinear Phenomena 42(1–3), 428–449 (1990).
10. S. Hameroff and S. Rasmussen, Microtubule automata: Sub-neural information processing in biological neural networks. In Theoretical Aspects of Neurocomputing (World Scientific, 1990), pp. 3–12.
11. A. Priel, J. A. Tuszynski, and H. F. Cantiello, The dendritic cytoskeleton as a computational device: An hypothesis. In The Emerging Physics of Consciousness (Springer, 2006), pp. 293–325.
12. F. Straub, Actin, II. Stud. Inst. Med. Chem. Univ. Szeged 3, 23–37 (1943).
13. E. D. Korn, Actin polymerization and its regulation by proteins from nonmuscle cells. Physiol. Rev. 62(2), 672–737 (1982).
14. A. G. Szent-Györgyi, The early history of the biochemistry of muscle contraction. J. Gen. Physiol. 123(6), 631–641 (2004).
15. F. Huber, J. Schnauß, S. Rönicke, P. Rauch, K. Müller, C. Fütterer, and J. A. Käs, Emergent complexity of the cytoskeleton: From single filaments to tissue. Adv. Phys. 62(1), 1–112 (2013).
16. T. Golde, C. Huster, M. Glaser, T. Händler, H. Herrmann, J. A. Käs, and J. Schnauß, Glassy dynamics in composite biopolymer networks. Soft Matter 14(39), 7970–7978 (2018). doi:10.1039/c8sm01061g.
17. J. Tuszyński, S. Hameroff, M. Satarić, B. Trpisova, and M. Nip, Ferroelectric behavior in microtubule dipole lattices: Implications for information processing, signaling and assembly/disassembly. J. Theoret. Biol. 174(4), 371–380 (1995).
18. J. Tuszynski, T. Luchko, E. Carpenter, and E. Crawford, Results of molecular dynamics computations of the structural and electrostatic properties of tubulin and their consequences for microtubules. J. Comput. Theoret. Nanosci. 1(4), 392–397 (2004).
19. J. Tuszyński, S. Portet, J. Dixon, C. Luxford, and H. Cantiello, Ionic wave propagation along actin filaments. Biophys. J. 86(4), 1890–1903 (2004).
20. J. Tuszyński, J. Brown, E. Crawford, E. Carpenter, M. Nip, J. Dixon, and M. Satarić, Molecular dynamics simulations of tubulin structure and calculations of electrostatic properties of microtubules. Math. Comput. Modell. 41(10), 1055–1070 (2005).
21. A. Priel, J. A. Tuszynski, and H. F. Cantiello, Ionic waves propagation along the dendritic cytoskeleton as a signaling mechanism. Adv. Molecular Cell Biol. 37, 163–180 (2006).
22. J. Tuszyński, S. Portet, and J. Dixon, Nonlinear assembly kinetics and mechanical properties of biopolymers. Nonlin. Anal.: Theory, Meth. Appl. 63(5–7), 915–925 (2005).


23. M. Satarić, D. Ilić, N. Ralević, and J. A. Tuszynski, A nonlinear model of ionic wave propagation along microtubules. Eur. Biophys. J. 38(5), 637–647 (2009).
24. M. Satarić, D. Sekulić, and M. Živanov, Solitonic ionic currents along microtubules. J. Comput. Theoret. Nanosci. 7(11), 2281–2290 (2010).
25. M. Satarić and B. Satarić, Ionic pulses along cytoskeletal protophilaments. J. Phys.: Conf. Ser. 329, 012009 (2011).
26. L. Kavitha, E. Parasuraman, A. Muniyappan, D. Gopi, and S. Zdravković, Localized discrete breather modes in neuronal microtubules. Nonlin. Dynam. 88(3), 2013–2033 (2017).
27. A. Adamatzky, J. Tuszynski, J. Pieper, D. V. Nicolau, R. Rinaldi, G. Sirakoulis, V. Erokhin, J. Schnauss, and D. M. Smith, Towards cytoskeleton computers. A proposal. In A. Adamatzky, S. Akl, and G. Sirakoulis (eds.), From Parallel to Emergent Computing (CRC Press/Taylor & Francis, 2019).
28. E. C. Lin and H. F. Cantiello, A novel method to study the electrodynamic behavior of actin filaments. Evidence for cable-like properties of actin. Biophys. J. 65(4), 1371 (1993).
29. J. S. Lorenz, J. Schnauß, M. Glaser, M. Sajfutdinow, C. Schuldt, J. A. Käs, and D. M. Smith, Synthetic transient crosslinks program the mechanics of soft, biopolymer-based materials. Adv. Mater. 30(13), 1706092 (2018).
30. G. Giannone, B. J. Dubin-Thaler, O. Rossier, Y. Cai, O. Chaga, G. Jiang, W. Beaver, H.-G. Döbereiner, Y. Freund, G. Borisy et al., Lamellipodial actin mechanically links myosin activity with adhesion-site formation. Cell 128(3), 561–575 (2007).
31. W. T. Huck, Responsive polymers for nanoscale actuation. Mater. Today 11(7–8), 24–32 (2008).
32. S. Siccardi, J. A. Tuszynski, and A. Adamatzky, Boolean gates on actin filaments. Phys. Lett. A 380(1–2), 88–97 (2016).
33. S. Siccardi and A. Adamatzky, Logical gates implemented by solitons at the junctions between one-dimensional lattices. Int. J. Bifurc. Chaos 26(06), 1650107 (2016).
34. A. Adamatzky, Logical gates in actin monomer. Sci. Rep. 7(1), 11755 (2017).
35. A. Adamatzky, On discovering functions in actin filament automata. arXiv preprint arXiv:1807.06352 (2018).
36. A. Adamatzky, F. Huber, and J. Schnauß, Computing on actin bundles network. Sci. Rep. 9(1), 15887 (2019). doi:10.1038/s41598-019-51354-y.
37. A. Adamatzky, J. Schnauß, and F. Huber, Actin droplet machine. Roy. Soc. Open Sci. 6(12), 191135 (2019). doi:10.1098/rsos.191135.
38. J. Schnauß, T. Golde, C. Schuldt, B. U. S. Schmidt, M. Glaser, D. Strehle, T. Händler, C. Heussinger, and J. A. Käs, Transition from a linear to a harmonic potential in collective dynamics of a multifilament actin bundle. Phys. Rev. Lett. 116(10), 108102 (2016).


39. J. Schnauß, T. Händler, and J. A. Käs, Semiflexible biopolymers in bundled arrangements. Polymers 8(8), 274 (2016). doi:10.3390/polym8080274.
40. M. Glaser, J. Schnauß, T. Tschirner, B. U. S. Schmidt, M. Moebius-Winkler, J. A. Käs, and D. M. Smith, Self-assembly of hierarchically ordered structures in DNA nanotube systems. New J. Phys. 18(5), 055001 (2016).
41. F. Huber, D. Strehle, J. Schnauß, and J. Käs, Formation of regularly spaced networks as a general feature of actin bundle condensation by entropic forces. New J. Phys. 17(4), 043029 (2015).
42. A. Adamatzky, On discovering functions in actin filament automata (Version 1). Zenodo (2018). doi:10.5281/zenodo.1312141.
43. S. Siccardi, A. Adamatzky, J. Tuszyński, F. Huber, and J. Schnauß, Actin networks voltage circuits. Phys. Rev. E 101, 052314 (2020). doi:10.1103/PhysRevE.101.052314.
44. J. A. Tuszynski, M. Sataric, D. Sekulic, B. Sataric, and Z. S., Nonlinear calcium ion waves along actin filaments control active hair-bundle motility. bioRxiv (2018). doi:10.1101/292292.
45. Z. Ma, J. Wang, and H. Guo, Weakly nonlinear ac response: Theory and application. Phys. Rev. B 59(11), 7575–7578 (1999).
46. B. G. Wang, X. Zhao, J. Wang, and H. Guo, Nonlinear quantum capacitance. Appl. Phys. Lett. 74(19), 2887–2889 (1999).
47. T. E. Angelini, H. Liang, W. Wriggers, and G. C. L. Wong, Like-charge attraction between polyelectrolytes induced by counterion charge density waves. Proc. Nat. Acad. Sci. 100(15), 8634–8637 (2003).
48. S. Asakura and F. Oosawa, On interaction between two bodies immersed in a solution of macromolecules. J. Chem. Phys. 22(7), 1255–1256 (1954). doi:10.1063/1.1740347.
49. S. Asakura and F. Oosawa, Interaction between particles suspended in solutions of macromolecules. J. Polym. Sci. 33(126), 183–192 (1958). doi:10.1002/pol.1958.1203312618.
50. J. D. Hunter, Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9(3), 90–95 (2007). doi:10.1109/MCSE.2007.55.
51. P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt, M. Brett, J. Wilson, K. Jarrod Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. Carey, İ. Polat, Y. Feng, E. W. Moore, J. VanderPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henriksen, E. A. Quintero, C. R. Harris, A. M. Archibald, A. H. Ribeiro, F. Pedregosa, P. van Mulbregt, and SciPy 1.0 Contributors, SciPy 1.0: Fundamental algorithms for scientific computing in Python. arXiv e-prints, arXiv:1907.10121 (2019).


52. D. Strehle, J. Schnauß, C. Heussinger, J. Alvarado, M. Bathe, J. A. Käs, and B. Gentry, Transiently crosslinked F-actin bundles. Eur. Biophys. J. 40(1), 93–101 (2011).
53. D. Strehle, P. Mollenkopf, M. Glaser, T. Golde, C. Schuldt, J. A. Käs, and J. Schnauß, Single actin bundle rheology. Molecules 22(10), 1804 (2017). doi:10.3390/molecules22101804.
54. F. Huber, D. Strehle, and J. Käs, Counterion-induced formation of regular actin bundle networks. Soft Matter 8, 931–936 (2012). doi:10.1039/C1SM06019H.
55. I. Elbalasy, P. Mollenkopf, C. Tutmarc, H. Herrmann, and J. Schnauß, Keratins determine network stress responsiveness in reconstituted actin-keratin filament systems. Soft Matter 17, 3954–3962 (2021). doi:10.1039/D0SM02261F.
56. T. Golde, M. Glaser, C. Tutmarc, I. Elbalasy, C. Huster, G. Busteros, D. M. Smith, H. Herrmann, J. A. Käs, and J. Schnauß, The role of stickiness in the rheology of semiflexible polymers. Soft Matter 15, 4865–4872 (2019). doi:10.1039/C9SM00433E.
57. C. Schuldt, J. Schnauß, T. Händler, M. Glaser, J. Lorenz, T. Golde, J. A. Käs, and D. M. Smith, Tuning synthetic semiflexible networks by bending stiffness. Phys. Rev. Lett. 117, 197801 (2016). doi:10.1103/PhysRevLett.117.197801.


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0004

Chapter 4

Implication and Not-Implication Boolean Logic Gates Mimicked with Enzyme Reactions — General Approach and Application to Signal-Triggered Biomolecule Release Processes Evgeny Katz Department of Chemistry and Biomolecular Science, Clarkson University, Potsdam, NY 13699, USA [email protected] The enzyme system mimicking Implication (IMPLY) and Not-IMPLY Boolean logic gates has been designed. The same enzyme system was used to operate as the IMPLY or Not-IMPLY gate simply by reformulating the input signals. The optical analysis of the logic operation confirmed the output generation as expected for the studied logic gates. The enzyme system was integrated with a hydrogel matrix releasing DNA molecules as the result of the logic operation, thus extending the logic operation to the actuation process. The conceptual approach to the IMPLY and Not-IMPLY logic gates allows their construction with many other enzymes operating in a similar way.

4.1. Introduction

Molecular1–4 and biomolecular5 computing, as subareas of unconventional computing,6 have attracted considerable attention and progressed rapidly in the last two decades. Enzyme-based logic systems,7 together with DNA/RNA computing systems,8–10 are the most important areas of research in the general framework of biomolecular information processing systems. Enzyme-catalyzed reactions,


including sophisticated multi-step/multi-enzyme reaction cascades, have been used to mimic almost all known Boolean logic gates,7, 11 such as Yes (Identity), Not (Inversion), OR, NOR, XOR, NXOR, AND, NAND and INHIB, including reversible logic gates,12, 13 such as the Feynman, Toffoli, Fredkin and Peres gates. While the vast majority of Boolean logic gates mimicked with enzyme systems have already been formulated,7, 11 optimized14–17 and reported for different applications,18–20 some unique logic operations still require additional study. In particular, the Implication (IMPLY) logic gate (also named “material implication”) has not received enough attention. This gate was first described more than 100 years ago and recognized as a basic logic operation.21 For technical reasons this gate was ignored in digital electronics for many years, but recently interest in it has revived, based on its straightforward realization in memristive switches.22 The IMPLY logic gate has been realized in many chemical systems,23–36 including biomolecular systems based on DNA molecules,37–49 particularly operated as cell biology processes.50 However, the IMPLY gate has rarely been demonstrated with enzyme-catalyzed reactions.51

4.2. Mimicking IMPLY Logic Gate

The present study aimed at a general approach to the IMPLY logic gate, allowing the use of various enzymes performing similar reactions activated with different input signals. Notably, the negative IMPLY gate (Not-IMPLY) is the Inhibited (INHIB) Boolean logic gate. The developed approach therefore also allowed the INHIB logic gate to be mimicked by the same general system with reformulated input signals. Both logic gates, IMPLY and INHIB, were used to trigger biomolecule release from an alginate hydrogel matrix. The IMPLY gate is a digital logic gate that implements a logical conditional,52 which is read “If A, then B” and is denoted by A ⊃ B or A → B. The truth or falsity of the compound proposition A ⊃ B depends not on any relationship between the meanings of the propositions, but only on the truth-values of A and B; A ⊃ B is false (digital 0) when A is true (1) and B is false (0), and it is


Figure 4.1. (a) The truth table for the IMPLY logic gate. (b) The enzyme-based catalytic system mimicking the IMPLY logic gate operation.

true (1) in all other cases. This logic operation is illustrated with the IMPLY truth table in Figure 4.1(a). From the chemical point of view, a first look at the IMPLY truth table causes confusion. Indeed, the absence of both input signals (0,0 input combination) and their presence (1,1 input combination) result in the same “truth” (1) output signal. Surprisingly, however, this logic function can be realized in a simple way using enzyme reactions, Figure 4.1(b). The constant (non-variable) “machinery” part of the system includes three enzymes, hexokinase (HK), glucose oxidase (GOx) and lactate oxidase (LOx), and glucose as a substrate (note that O2 was always present because the experiments were performed under air). Variable (binary; YES/NO) input signals, adenosine triphosphate (ATP; Input A) and lactate (Lac; Input B), were defined as experimentally optimized reactant concentrations for their logic values 1, and as their complete absence (physically zero concentration) for the logic value 0. All reactants (the “machinery” and the variable signals) were applied in solution (see the appendix for the specific composition of the solutions). The system operated in the following way. When both inputs were applied in the 0,0 combination (in the absence of ATP and Lac), only GOx was activated, due to the constant presence of glucose (Glc; note that glucose is always present as part of the “machinery” system), thus resulting in H2O2 production, defined as logic output 1.


When the input signals were applied in the 0,1 combination (absence of ATP, but presence of Lac), two enzymes, GOx and LOx, were activated, both resulting in the formation of H2O2 (output 1). The situation was different when the signals were applied in the 1,0 combination (presence of ATP and absence of Lac). In this case, in the presence of ATP, HK rapidly consumed the glucose (note that the activity of HK was much higher than the activity of GOx). Therefore, in the absence of Lac and upon rapid consumption of Glc, the LOx and GOx reactions were inhibited and H2O2 was not produced (output 0). Finally, when the signal combination was 1,1 (in the presence of ATP and Lac), H2O2 was produced through the LOx reaction (output 1), while the GOx reaction was inhibited (remember that Glc was consumed by the HK reaction in the presence of ATP). Overall, the enzyme reactions shown in Figure 4.1(b) followed the logic operation expected for the IMPLY logic gate summarized in the truth table, Figure 4.1(a). While the H2O2 formation was considered as the primary output signal generated by the enzyme reactions, the final experimentally measurable output signal might be different. In the present example, Figure 4.1(b), an additional enzyme reaction catalyzed by horseradish peroxidase (HRP) was used to produce an optically readable output. The HRP-catalyzed oxidation of 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) in the presence of H2O2 resulted in an absorbance increase at 420 nm, defined as output signal 1 when the absorbance was above an experimentally defined threshold. ABTS was not oxidized in the absence of H2O2, resulting in a low absorbance below the threshold, corresponding to output 0. Figure 4.2(a) shows the experimentally measured absorbance spectra when the input signals (ATP and Lac) were applied in the four different combinations: 0,0; 0,1; 1,0 and 1,1. The optical absorbance measurements are also summarized in the bar chart, Figure 4.2(b). It should be noted that the HRP and ABTS components are not mandatory parts of the logic gate “machinery” and are needed only for the optical analysis of the output signal. They are not needed if the H2O2 primary output signal is analyzed in a different way, for example electrochemically. Importantly, the same concept can be realized with many different


Figure 4.2. (a) The optical absorbance spectra obtained upon realization of the IMPLY gate with the input signals applied in various combinations: (a) 0,0; (b) 0,1; (c) 1,0; (d) 1,1. The spectra were measured 7 min after the input signal application. (b) The bar chart demonstrating the optical absorbance measured at λmax = 420 nm, produced 7 min after the input signal application, for the four different signal combinations. The dashed line is the threshold separating logic 0 (below the line) and logic 1 (above the line) values. Error bars correspond to three repeated experiments.

enzymes and substrate-signals. In general, the concept requires two enzymes operating in parallel (GOx and LOx in the exemplified system, Figure 4.1(b)) and another enzyme rapidly consuming a substrate (Glc in the present example). This can be achieved with various oxidases (GOx and LOx are only examples) or with NAD+-dependent enzymes resulting in NADH production instead of H2O2. Obviously, the enzyme selected for the consumption of a substrate should be chosen depending on the substrate used (HK was used for glucose consumption, as an example).

4.3. Mimicking INHIB Logic Gate

The negative IMPLY logic operation (Not-IMPLY) corresponds to the Inhibited (INHIB) logic gate, which has already been realized in various enzyme-based systems.11, 53, 54 The INHIB gate produces the output signal 1 only when the inhibiting signal A appears at its logic value 0 and the value signal B appears at its logic value 1. Otherwise, the gate returns output 0, as shown in the truth table, Figure 4.3(a). The INHIB logic gate is the negation of the IMPLY gate. The general approach to the IMPLY gate discussed


Figure 4.3. (a) The truth table for the INHIB (Not-IMPLY) logic gate. (b) The enzyme-based catalytic system mimicking the INHIB logic gate operation.

above can be immediately applied to perform the INHIB logic operation using the same enzyme system with only reformulated input signals, Figure 4.3(b). In the new realization, ATP (Input A) is used as an inhibiting input, while Glc (Input B) is a value input. When both inputs are applied in the 0,0 combination (both absent), all enzymes in the system are mute, because they lack the corresponding substrates, and the output signal 0 is generated. When ATP is absent but Glc is present (0,1 input combination), the GOx-catalyzed reaction is activated and H2O2 is produced and visualized through the ABTS oxidation catalyzed by HRP (output 1). When ATP is present but Glc is absent (1,0 input combination), the GOx reaction cannot proceed since the Glc substrate is absent, thus producing output 0. Finally, and most importantly, in the presence of both ATP and Glc (1,1 input combination), the output is also 0 because the HK reaction in the presence of ATP rapidly consumes Glc, thus inhibiting H2O2 production (note that the HK activity is much higher than the GOx activity). The optical absorbance spectra and the bar chart showing the maximum absorbance values for the four input signal combinations 0,0; 0,1; 1,0 and 1,1 are shown in Figures 4.4(a) and 4.4(b), respectively. It should be noted that LOx is not needed in the present realization of the INHIB gate; it is preserved in the system only to demonstrate that exactly the same enzyme system can be used for both logic operations.


Figure 4.4. (a) The optical absorbance spectra obtained upon realization of the INHIB gate with the input signals applied in various combinations: (a) 0,0; (b) 0,1; (c) 1,0; (d) 1,1. The spectra were measured 7 min after the input signal application. (b) The bar chart demonstrating the optical absorbance measured at λmax = 420 nm, produced 7 min after the input signal application, for the four different signal combinations. The dashed line is the threshold separating logic 0 (below the line) and logic 1 (above the line) values. Error bars correspond to three repeated experiments.
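The logical behavior of both cascades can be condensed into a toy Boolean model; a minimal sketch, in which the function names are ours and the single kinetic assumption, taken from the text, is that HK outcompetes GOx for glucose whenever ATP is present:

```python
def imply_gate(atp, lac):
    """IMPLY: H2O2 output = (not A) or B, with A = ATP, B = Lac."""
    glc = not atp        # HK + ATP rapidly consumes the glucose
    gox = glc            # GOx makes H2O2 from the remaining glucose
    lox = bool(lac)      # LOx makes H2O2 from lactate
    return int(gox or lox)

def inhib_gate(atp, glc_in):
    """INHIB (Not-IMPLY): H2O2 output = B and (not A), with B = Glc."""
    return int(bool(glc_in) and not atp)

for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b}  IMPLY={imply_gate(a, b)}  INHIB={inhib_gate(a, b)}")
```

Running the loop reproduces the truth tables of Figures 4.1(a) and 4.3(a).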

4.4. Using the IMPLY and INHIB Logic Gates for Stimulating Molecule Release Function

Various enzyme-based logic systems have been integrated with different bioelectronic20, 55 and biomolecular56 devices, particularly including signal-triggered biomolecule release systems.57 Among many other biomolecule release systems,58–60 the release of biomolecules from alginate hydrogel matrices is an interesting and powerful option, which has been studied experimentally and modelled theoretically.61 It is particularly important that the release of biomolecules entrapped in alginate polymer matrices can be stimulated with biomolecule signals logically processed through enzyme-catalyzed reactions associated with the alginate hydrogels.62, 63 Enzymes logically processing biomolecule signals have been covalently bound to SiO2 nanoparticles (SiO2-NPs) and then entrapped in alginate hydrogel films electrochemically deposited on an electrode surface.19 Then, the reactions mimicking logic operations resulted in the alginate film dissolution


and release of the entrapped DNA molecules. In the present work, we used the new enzyme systems performing the IMPLY and INHIB logic operations to release entrapped DNA molecules, following the concept reported earlier.19 The HK, GOx and LOx enzymes were bound covalently to silanized SiO2-NPs, Figure 4.5, and then physically entrapped in an Fe3+-cross-linked alginate hydrogel film electrochemically deposited on a graphite rod electrode, Figure 4.6.


Figure 4.5. Functionalization of the SiO2 -NPs: (a) Silanization of the SiO2 -NPs with APTES. (b) Carbodiimide coupling of GOx to the silanized SiO2 -NPs (note that attachment of LOx and HK was performed in the same way).


Figure 4.6. (a) Scheme of the Fe3+ -cross-linked alginate gel electrochemically deposited on a graphite electrode surface. (b) The microscope image of the graphite electrode coated with the Fe3+ -cross-linked alginate film. Note that this image was obtained for the alginate film without entrapped enzyme-SiO2 -NPs and DNA-FAM.


Note that HRP and ABTS were not used in this system; the in situ produced H2O2 was the final output signal generated by the enzyme system. DNA molecules labeled with a fluorescent dye (DNA-FAM; see its structure in Figure 4.7) were co-entrapped in the alginate film. In order to minimize uncontrolled leakage of DNA-FAM, the alginate film was coated with poly(ethyleneimine) (PEI; see its structure in Figure 4.7), and the carboxylic groups of the alginate were then covalently bound to the amino groups of PEI for additional stabilization of the protecting polymer film. The H2O2 enzymatically produced inside the alginate hydrogel film resulted in the film dissolution and the release of the DNA-FAM molecules. The mechanism of alginate hydrogel decomposition/dissolution in the presence of H2O2 is well known and


Figure 4.7. (A) The signal-triggered DNA-FAM release from the Fe3+ -crosslinked alginate hydrogel film produced electrochemically at an electrode surface. The enzymes logically processing the input signals were bound covalently to SiO2 -NPs and entrapped in the alginate film. (B) The kinetics of the DNA-FAM release upon the system operating as the IMPLY logic gate with the signals applied in different combinations: (a) 0,0; (b) 0,1; (c) 1,0; (d) 1,1. (C) The kinetics of the DNA-FAM release upon the system operating as the INHIB logic gate with the signals applied in different combinations: (a) 0,0; (b) 0,1; (c) 1,0; (d) 1,1. The structures of the fluorescent (FAM) label bound to the DNA and PEI are shown at the right. The definitions of the IMPLY and INHIB logic gates and the input signals are shown in Figures 4.1 and 4.3, respectively.


based on the formation of free radicals through a Fenton-type reaction catalyzed by the iron cations present in the alginate film,64 Figure 4.7(A). Depending on the kind of logic system realized (IMPLY or INHIB), the H2O2 output was generated for different combinations of the input signals, as shown in Figures 4.1 and 4.3. The release of DNA-FAM into the solution from the disrupted alginate film was analyzed by fluorescence measurements. Figures 4.7(B) and 4.7(C) show the kinetics of the DNA-FAM release process upon application of the four input signal combinations for the IMPLY and INHIB gates, respectively. Notably, the observed DNA-FAM release followed the logic operations expected for the mimicked logic gates.

4.5. Conclusions

In summary, we designed a universal enzyme system mimicking the IMPLY and INHIB logic operations. Importantly, the general approach allows similar logic operations with many different enzymes (for example, NAD+/NADH-dependent dehydrogenases, Figure 4.8) performing the logic operations in a similar way. The output signal, in the form of in situ produced H2O2, was used to stimulate biomolecule release following the IMPLY and INHIB logic


Figure 4.8. Realization of the IMPLY (a) and INHIB (b) logic gates using NAD+/NADH-dependent dehydrogenases. Note that the enzyme systems operate in a way similar to the systems shown in Figures 4.1 and 4.3. The output signal is defined as the production of NADH.


operations. While the example system was used for DNA release to illustrate the concept, other biomolecules or drugs (e.g., insulin or lysozyme) can be released in the same way.61, 65

Appendix

Abbreviations used (some of them appear only in the figures):

ABTS: 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid)
ABTSox: oxidized form of ABTS
ADP: adenosine diphosphate
ATP: adenosine triphosphate
DNA: deoxyribonucleic acid
DNA-FAM: DNA labeled with a fluorescent dye
EDC: 1-ethyl-3-[3-(dimethylamino)propyl]carbodiimide
GDH: glucose dehydrogenase, NAD+/NADH-dependent enzyme
Glc: β-D-glucose
Glc6P: glucose-6-phosphate
GlcA: gluconic acid (product of Glc oxidation)
GOx: glucose oxidase (enzyme)
HEPES-buffer: 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid
HK: hexokinase (enzyme)
HRP: horseradish peroxidase (enzyme)
LDH: lactate dehydrogenase, NAD+/NADH-dependent enzyme
LOx: lactate oxidase (enzyme)
NHS: N-hydroxysuccinimide
NPs: nanoparticles
PEI: poly(ethyleneimine)
Pyr: pyruvate
RNA: ribonucleic acid
TRIS-buffer: 2-amino-2-(hydroxymethyl)propane-1,3-diol


IMPLY and INHIB logic gate composition

The reactant solution included: TRIS-buffer, 25 mM, pH = 7.0.

The “machinery” (non-variable) part of both (IMPLY and INHIB) logic gates included: HK, 5 U per mL; GOx, 0.5 U per mL; LOx, 0.5 U per mL (note that LOx is not needed for the INHIB operation, but can be added for consistency with the IMPLY gate); HRP, 5 U per mL; Glc, 0.02 mM; ABTS, 0.0125 mM; O2, under equilibrium with air.

The variable part of the IMPLY logic gate:
Input A (ATP): logic 0, 0 mM (complete absence); logic 1, 0.5 mM.
Input B (Lac): logic 0, 0 mM (complete absence); logic 1, 0.005 mM.

The variable part of the INHIB logic gate:
Input A (ATP): logic 0, 0 mM (complete absence); logic 1, 0.5 mM.
Input B (Glc): logic 0, 0 mM (complete absence); logic 1, 0.02 mM.

Acknowledgments

This work was supported by the Human Frontier Science Program, project grant RGP0002/2018 to EK.

References

1. K. Szacilowski, Infochemistry: Information Processing at the Nanoscale (Wiley, Chichester, 2012).
2. A. P. de Silva, Molecular Logic-Based Computation (Royal Society of Chemistry, Cambridge, 2013).
3. E. Katz (ed.), Molecular and Supramolecular Information Processing: From Molecular Switches to Logic Systems (Wiley-VCH, Weinheim, 2012).
4. T. Sienko (ed.), Molecular Computing (MIT Press, Cambridge, MA, USA, 2003).
5. E. Katz (ed.), Biomolecular Information Processing: From Logic Systems to Smart Sensors and Actuators (Wiley-VCH, Weinheim, 2012).
6. A. Adamatzky (ed.), Advances in Unconventional Computing, in series: Emergence, Complexity and Computation, 2 volumes (Springer, Switzerland, 2017).
7. E. Katz, Enzyme-Based Computing Systems (Wiley-VCH, 2019).


8. M. N. Stojanovic, D. Stefanovic, and S. Rudchenko, Acc. Chem. Res. 47, 1845–1852 (2014).
9. Z. Ezziane, Nanotechnology 17, R27–R39 (2006).
10. Z. Xie, L. Wroblewska, L. Prochazka, R. Weiss, and Y. Benenson, Science 333, 1307–1311 (2011).
11. E. Katz, ChemPhysChem 20, 9–22 (2019).
12. B. E. Fratto and E. Katz, ChemPhysChem 16, 1405–1415 (2015).
13. B. E. Fratto and E. Katz, ChemPhysChem 17, 1046–1053 (2016).
14. Y. Filipov, S. Domanskyi, M. L. Wood, M. Gamella, V. Privman, and E. Katz, ChemPhysChem 18, 2908–2915 (2017).
15. V. Privman, S. Domanskyi, S. Mailloux, Y. Holade, and E. Katz, J. Phys. Chem. B 118, 12435–12443 (2014).
16. V. Privman, O. Zavalov, L. Halámková, F. Moseley, J. Halámek, and E. Katz, J. Phys. Chem. B 117, 14928–14939 (2013).
17. S. Bakshi, O. Zavalov, J. Halámek, V. Privman, and E. Katz, J. Phys. Chem. B 117, 9857–9865 (2013).
18. E. Katz, J. Wang, M. Privman, and J. Halámek, Anal. Chem. 84, 5463–5469 (2012).
19. M. Gamella, M. Privman, S. Bakshi, A. Melman, and E. Katz, ChemPhysChem 18, 1811–1821 (2017).
20. E. Katz and M. Pita, Chem. Eur. J. 15, 12554–12564 (2009).
21. A. Whitehead and B. Russell, Principia Mathematica (University Press, Cambridge, 1910).
22. J. Borghetti, G. S. Snider, P. J. Kuekes, J. J. Yang, D. R. Stewart, and R. S. Williams, Nature 464, 873–876 (2010).
23. Q.-Q. Fu, J.-H. Hu, Y. Yao, Z.-Y. Yin, K. Gui, N. Xu, L.-Y. Niu, and Y.-Q. Zhang, J. Photochem. Photobiol. A 391, art. no. 112358 (2020).
24. V. K. Singh, V. Singh, P. K. Yadav, S. Chandra, D. Bano, B. Koch, M. Talat, and S. H. Hasan, J. Photochem. Photobiol. A 384, art. no. 112042 (2019).
25. S. Liao, X. Li, H. Yang, and X. Chen, Talanta 194, 554–562 (2019).
26. Y.-M. Zhang, W. Zhu, W.-J. Qu, H.-L. Zhang, Q. Huang, H. Yao, T.-B. Wei, and Q. Lin, J. Luminescence 202, 225–231 (2018).
27. L. Tang, S. Mo, S. G. Liu, L. L. Liao, N. B. Li, and H. Q. Luo, Sens. Actuat. B 255, 754–762 (2018).
28. K. D. Renuka, C. L. Lekshmi, K. Joseph, and S. Mahesh, Chem. Select 2, 11615–11619 (2017).
29. S. G. Liu, N. Li, Y. Z. Fan, N. B. Li, and H. Q. Luo, Sens. Actuat. B 243, 634–641 (2017).
30. W.-T. Li, G.-Y. Wu, W.-J. Qu, Q. Li, J.-C. Lou, Q. Lin, H. Yao, Y.-M. Zhang, and T.-B. Wei, Sens. Actuat. B 239, 671–678 (2017).
31. G. Wang, H. Chen, Y. Chen, and N. Fu, Sens. Actuat. B 233, 550–558 (2016).
32. T. Liu, N. Li, J. X. Dong, H. Q. Luo, and N. B. Li, Sens. Actuat. B 231, 147–153 (2016).


33. G. Singh, J. Singh, J. Singh, and S. S. Mangat, J. Luminescence 165, 123–129 (2015).
34. A. Kuwar, R. Patil, A. Singh, S. K. Sahoo, J. Marek, and N. Singh, J. Mater. Chem. C 3, 453–460 (2015).
35. J. Wu, Y. Gao, J. Lu, J. Hu, and Y. Ju, Sens. Actuat. B 206, 516–523 (2015).
36. X. Tian, Z. Dong, J. Hou, R. Wang, and J. Ma, J. Luminescence 145, 459–465 (2014).
37. X. Lin, Z. Hao, H. Wu, M. Zhao, X. Gao, S. Wang, and Y. Liu, Microchim. Acta 186, art. no. 648 (2019).
38. S. Ma, Q. Zhang, D. Wu, Y. Hu, D. Hu, Z. Guo, S. Wang, Q. Liu, and J. Peng, J. Electroanal. Chem. 847, art. no. 113144 (2019).
39. J. Chen, J. Pan, and S. Chen, Chem. Sci. 9, 300–306 (2018).
40. Y.-C. Chen, C. W. Wang, J. D. Lee, P.-C. Chen, and H.-T. Chang, J. Chinese Chem. Soc. 64, 8–16 (2017).
41. Y.-F. Huo, L.-N. Zhu, X.-Y. Li, G.-M. Han, and D.-M. Kong, Sens. Actuat. B 237, 179–189 (2016).
42. L. Ge, W. Wang, X. Sun, T. Hou, and F. Li, Anal. Chem. 88, 9691–9698 (2016).
43. X.-Y. Li, J. Huang, H.-X. Jiang, Y.-C. Du, G.-M. Han, and D.-M. Kong, RSC Adv. 6, 38315–38320 (2016).
44. R.-R. Gao, S. Shi, Y. Zhu, H.-L. Huang, and T.-M. Yao, Chem. Sci. 7, 1853–1861 (2016).
45. W. Gao, L. Zhang, R.-P. Liang, and J.-D. Qiu, Chem. Eur. J. 21, 15272–15279 (2015).
46. Y. Jiang, N. Liu, W. Guo, F. Xia, and L. Jiang, J. Am. Chem. Soc. 134, 15395–15401 (2012).
47. K. S. Park, M. W. Seo, C. Jung, J. Y. Lee, and H. G. Park, Small 8, 2203–2212 (2012).
48. Y. Jiang, N. Liu, W. Guo, F. Xia, and L. Jiang, J. Am. Chem. Soc. 134, 15395–15401 (2012).
49. J.-H. Guo, D.-M. Kong, and H.-X. Shen, Biosens. Bioelectron. 26, 327–332 (2010).
50. T. Miyamoto, S. Razavi, R. DeRose, and T. Inoue, ACS Synth. Biol. 2, 72–82 (2013).
51. C. Chen, D. Zhao, Y. Jiang, P. Ni, C. Zhang, B. Wang, F. Yang, Y. Lu, and J. Sun, Anal. Chem. 91, 15017–15024 (2019).
52. F. M. Brown, Boolean Reasoning: The Logic of Boolean Equations, 1st edn. (Kluwer Academic Publishers, Norwell, MA, 2003).
53. R. Baron, O. Lioubashevski, E. Katz, T. Niazov, and I. Willner, J. Phys. Chem. A 110, 8548–8553 (2006).
54. G. Strack, M. Pita, M. Ornatska, and E. Katz, ChemBioChem 9, 1260–1266 (2008).
55. J. Halámek, T. K. Tam, G. Strack, V. Bocharova, M. Pita, and E. Katz, Chem. Commun. 46, 2405–2407 (2010).


56. B. E. Fratto, J. M. Lewer, and E. Katz, ChemPhysChem 17, 2210–2217 (2016).
57. E. Katz, J. M. Pingarrón, S. Mailloux, N. Guz, M. Gamella, G. Melman, and A. Melman, J. Phys. Chem. Lett. 6, 1340–1347 (2015).
58. M. Bellare, V. Krishna Kadambar, P. Bollella, M. Gamella, E. Katz, and A. Melman, Electroanalysis 31, 2274–2282 (2019).
59. M. Bellare, V. Krishna Kadambar, P. Bollella, E. Katz, and A. Melman, ChemElectroChem 7, 59–63 (2020).
60. M. Bonini, D. Berti, and P. Baglioni, Curr. Opin. Colloid Interface Sci. 18, 459–467 (2013).
61. S. Scheja, S. Domanskyi, M. Gamella, K. L. Wormwood, C. C. Darie, A. Poghossian, M. J. Schöning, A. Melman, V. Privman, and E. Katz, ChemPhysChem 18, 1541–1551 (2017).
62. A. V. Okhokhonin, S. Domanskyi, Y. Filipov, M. Gamella, A. N. Kozitsina, V. Privman, and E. Katz, Electroanalysis 30, 426–435 (2018).
63. Y. Filipov, M. Gamella, and E. Katz, Electroanalysis 30, 1281–1286 (2018).
64. O. Smidsrød, A. Haug, and B. Larsen, Acta Chem. Scand. 19, 143–152 (1965).
65. Z. Jin, A. M. Harvey, S. Mailloux, J. Halámek, V. Bocharova, M. R. Twiss, and E. Katz, J. Mater. Chem. 22, 19523–19528 (2012).


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0005

Chapter 5

Molecular Computation via Polymerase Strand Displacement Reactions

Shalin Shah∗,†, Ming Yang†, Tianqi Song† and John Reif∗,†,‡

∗Department of Electrical and Computer Engineering, Duke University, Durham, NC 27701, USA
†Department of Computer Science, Duke University, Durham, NC 27701, USA
‡[email protected]

The field of DNA computing has seen many major breakthroughs since the early results of Adleman. Recent DNA computing works have focused primarily on enzyme-free computing architectures, that is, DNA-only circuits and devices. The rationale behind engineering such systems is to have biologically simpler machines, which offers several benefits such as no dependency on temperature control systems. Recently, however,1, 2 the use of enzyme-based systems has been gaining momentum yet again. These systems also compute using strand displacement, similar to enzyme-free architectures; however, the strand displacement is facilitated by a polymerase enzyme. Such enzymatic assistance provides an alternative method for the design and development of such systems. In this work, we discuss two computing avenues, namely, DNA circuits and chemical reaction networks, along with a section on methods and protocols describing how polymerase-based systems operate. We end this chapter with a brief discussion of future possibilities for these emerging avenues of DNA computing.

5.1. Introduction

The field of DNA computing, now more than 25 years old, has come a long way since its first attempt at solving computationally hard


problems.3 Since then, numerous computational architectures have been proposed,1, 2, 4–14 including molecular motors and robots,15–19 software packages to design DNA strands,20–26 protocols for liquid handling to design mesoscale structures,27–30 DNA-based imaging architectures,23, 31–35 archival storage drives,24, 36–38 and several other related applications.

5.1.1. Logic circuits

The fundamental units of DNA computing include DNA strands and supporting enzymes such as ligases, polymerases, and nickases. For example, Adleman used DNA hybridization and ligation to solve the Hamiltonian path problem for a directed graph.3 Several recent architectures use enzyme-free, DNA-only strand displacement logic circuits.11, 12, 39 Some robotic architectures use a DNAzyme and an RNA target to implement a walker.17, 40 The popular DNA PEN toolbox uses three enzymes, namely, polymerase, exonuclease, and nicking enzyme, in addition to DNA strands, for its computational ability.8 Numerous other architectures have also been proposed, which demonstrates the power and programmability of DNA computing systems. This work, in particular, will focus on the recent polymerase-based single-stranded DNA architecture2 that uses a strand-displacing polymerase to compute logic circuits.

5.1.2. Chemical reaction networks

Chemical reaction networks (CRNs) are traditionally used to model the species dynamics of a well-mixed solution occurring in nature (e.g., cell regulation). However, in DNA nanotechnology, mainly due to the phenomenal success of large-scale DNA logic systems,2, 5, 6, 11, 12, 16, 39, 41 it was realized that CRNs can form a programming language for synthetic DNA-based molecular systems.1, 4, 13 This exciting realization has also been supported by rigorous explorations of the computational power of CRNs. It is understood that stochastic CRNs are Turing universal under certain error tolerance assumptions, whereas deterministic CRNs are strictly weaker, computing exactly the class of semilinear functions.42, 43 More formally, a CRN is defined as a finite set of reactions S over a finite set of non-negative elements


E such that reactants are consumed to produce products at some rate k, that is, R −→ P with rate constant k, where R, P ∈ E. A CRN can thus be represented as a tuple (S, E, K), where K collects the rate constants. Several DNA-based systems have been proposed1, 4, 13, 44 to implement arbitrary CRNs, and some of them have also been experimentally verified.5, 39 In this work, we will mainly discuss the most recent scheme, which uses a strand-displacing polymerase to implement arbitrary CRNs.

5.1.3. Chapter organization

The rest of this chapter is organized as follows. In Section 5.2, we introduce the fundamental unit of computing, that is, DNA strand displacement; this includes both toehold-mediated strand displacement and polymerase-based strand displacement. In Section 5.3, we discuss how a strand-displacing polymerase can be used to perform DNA computing by building Boolean logic circuits. Section 5.4 shows how chemical reaction networks can be implemented using a DNA substrate and a strand-displacing polymerase enzyme. Unlike DNA-only reactions, polymerase reactions do not operate at room temperature, and therefore careful design of the DNA strands and the fluorescence setup is required; Section 5.5 describes the corresponding methods and protocols. We conclude the chapter with a summary of the work and a discussion of future possibilities.

5.2. Strand Displacement

Just as the transistor forms the fundamental unit of silicon-based computing, strand displacement forms the fundamental unit of DNA computing. If several thousand strand displacement units are combined, one can, in principle, design a fully working biocomputer. DNA strand displacement2, 7–11, 16, 35, 45 is defined as the replacement of an incumbent strand in a dsDNA complex by an input strand. There are two ways to perform strand displacement: (a) toehold-mediated strand displacement, and (b) polymerase-based strand displacement. Most prior works relied on the principle of toehold-mediated strand displacement (TMSD),15 as they followed the rationale of biologically simple systems. TMSD is an enzyme-free unit, that is, it requires only DNA strands.



Figure 5.1. Toehold-mediated strand displacement versus polymerase-based strand displacement along with the intermediate stages.

Polymerase-based strand displacement (PSD),1 on the other hand, requires a polymerase enzyme in addition to DNA strands. A pictorial representation of both techniques is shown in Figure 5.1.

In TMSD, a partial dsDNA molecule acts as the gate, and the input strand binds to the dsDNA complex through a short single-stranded opening called a toehold. Upon binding at the toehold region, a tug-of-war between the input ssDNA and the incumbent DNA strand begins. Since the toehold region makes full binding of the input strand thermodynamically46 more favorable, the input strand eventually displaces the incumbent strand completely, and the incumbent is released as the output. This entire process can be abstracted as a pass gate in which input strand I releases output strand O using gate complex G. TMSD has been used to construct interesting logic circuits, reaction networks, and even targeted drug-delivery applications. More details about the applications of TMSD can be found in Refs. [10, 11, 41, 47].

In PSD, a partial dsDNA molecule with a single-stranded region exposed in the 5′ direction is used. An input strand binds to this toehold region, and a polymerase then simultaneously synthesizes the full strand and displaces the incumbent strand. The energy source in this scheme is the polymerase (fueled by dNTP incorporation) rather than the branch portion of the input strand. PSD is a relatively old-yet-new idea in the field of DNA nanoscience. While the polymerase enzyme has been used in standard techniques such as polymerase


chain reaction (PCR) and in early DNA computing works, the idea of using a polymerase enzyme for DNA computing has faded away in the last few years. Most of the focus has been on enzyme-free DNA circuits, as they are biologically simpler. However, a few works have also explored the idea of PSD for computing and reaction network applications.1, 2, 8
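To make the contrast concrete, a toy domain-level model can be sketched in a few lines. In the Python sketch below, strands are tuples of abstract domain names, and the two firing rules are our own illustrative simplification (not the formalism of Refs. [1, 2]): a TMSD gate fires only when the input carries both the toehold and the branch domain, whereas a PSD gate fires on a toehold-only input, with the polymerase assumed to synthesize the remainder.

    # Toy domain-level model contrasting TMSD and PSD (illustrative only).
    # A gate is (toehold, branch, incumbent); domains are plain strings.

    def tmsd_fire(gate, input_strand):
        # Toehold-mediated: the input must span toehold + branch domains.
        toehold, branch, incumbent = gate
        return incumbent if input_strand == (toehold, branch) else None

    def psd_fire(gate, input_strand):
        # Polymerase-based: a toehold-only input suffices; the polymerase
        # extends it over the branch domain, displacing the incumbent.
        toehold, branch, incumbent = gate
        return incumbent if input_strand == (toehold,) else None

    G = ("t", "b", "O")                # gate holding output strand O
    print(tmsd_fire(G, ("t", "b")))    # O: full-length input required
    print(psd_fire(G, ("t",)))         # O: short primer-like input suffices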

5.3. Using Strand Displacing Polymerase for Computation

DNA circuits have been increasingly explored over the last decade, and the ones that have been implemented are primarily based on TMSD, which is an enzyme-free unit.2, 12, 39 For example, Qian et al. experimentally demonstrated a four-bit square-root circuit that comprises 130 DNA strands, is based on seesaw gates, and takes hours to compute the result.11 Although the principle of TMSD is biologically simple, it can introduce reaction leaks and is sometimes time-consuming, especially when the length of the toehold increases. While the challenge of slow computing speed has been approached from two different directions, namely, high concentration48 and localization,15 our focus in this work is to explore the alternative avenue of PSD, which can reduce circuit size.

Song et al. proposed a fast and compact DNA logic circuit architecture that uses PSD.2 Their architecture is based on single-stranded DNA logic gates, which largely reduces leakage reactions and signal restoration steps. While PSD is generally more effective at strand displacement than TMSD, since the enzyme is an energy source, it might also be expected to produce more leaky reactions because the overall system has higher energy. However, the ssDNA gates are designed cleverly to mitigate the leak issues without compromising the reaction speed. Therefore, the circuits achieve faster computation with fewer DNA strands, which enables easy construction of large-scale logic circuits. In their experiments, they use Bst 2.0 DNA polymerase, which has strong strand-displacement activity and a large range of salt tolerance. The active temperature for Bst 2.0 DNA polymerase is around 65°C, and they conduct


Figure 5.2. DNA implementation of the logic gates OR and AND. (a) OR gate. (b) Fuel strand (F) is bound to the OR gate by DNA hybridization (DH) and then extended by a polymerization (PO) reaction. (c) Input hybridizes with the gate complex and releases the output by a PO reaction. (d) Reactions with a reporter complex. Input (O) hybridizes with the reporter complex, then separates fluorophores and quenchers by a PO reaction, releasing a fluorescence signal. (e) AND gate. (f) and (g) Two possible reaction pathways to produce output (AO) or (BO). (h) Side reactions without any output generation.2 The figure is adapted from Ref. [2] with permission.

the experiments at 55°C. All the strand domains they use in the experiments are around 30 nt long. The functionality of the OR and AND gates is shown in Figure 5.2. Each gate is of the two-input (A, B), one-output (O) form. Logic-OR means that if either input strand is present,


it should be able to release the output strand. To implement this function, two ssDNA gates are used (refer to Figures 5.2(b) and 5.2(c)). If the inputs arrive in the right order, the fuel first binds to produce a partially double-stranded complex, and the input can then release the output strand. The output of the gate is reported using standard fluorescence techniques.1, 5, 11, 12, 39 The functionality of the OR gate is observed in Figure 5.2(a), where all four input combinations were tested. As expected, when either input is present, output fluorescence is observed; when no input is present, no output is observed. Similar to the logic-OR gate, the logic-AND gate is also a two-input (A, B), one-output (O) gate. However, the output is triggered only when both inputs coexist in the solution. The functionality of this gate is shown in Figure 5.2(e): output fluorescence is observed only in the single case where both inputs are present. Using these simple units of logic computing, the work demonstrated a large-scale circuit that comprises 37 DNA strands and can compute a 4-bit square-root function with a half-completion time of around 25 min. It is beyond the scope of this chapter to discuss the implementation details of the circuits; the reader is encouraged to refer to the full paper. The logic circuits demonstrated by Song et al. required less computation time and fewer DNA strands than the prior state of the art in the field, demonstrating the potential of PSD logic circuits.
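At this level of abstraction, each gate simply maps the presence or absence of its two input strands to the presence or absence of output fluorescence. A minimal sketch (Python; purely illustrative and not the authors' design tooling, since real gates report analog fluorescence rather than clean bits):

    # Expected digital readout of the two-input ssDNA gates of Figure 5.2.
    def or_gate(a: bool, b: bool) -> bool:
        return a or b     # either input strand releases the output

    def and_gate(a: bool, b: bool) -> bool:
        return a and b    # output released only when both inputs coexist

    for a in (False, True):
        for b in (False, True):
            print(f"A={int(a)} B={int(b)}  OR->{int(or_gate(a, b))}"
                  f"  AND->{int(and_gate(a, b))}")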

5.4. CRNs Using Strand Displacing Polymerase

The first step towards designing arbitrary CRNs using a strand-displacing polymerase is to implement unimolecular and bimolecular CRNs, because any complex set of CRNs can be broken down into these simple reactions. To implement a unimolecular CRN, Shah et al. proposed a two-step process (refer to Figure 5.3). Consider the unimolecular reaction A −→ B, in which input A produces output B at some rate k. This reaction can be implemented using two auxiliary complexes: one converts the input A to an intermediate species I, and the other converts the intermediate I to the output B. Both steps use a strand-displacing polymerase to undergo PSD.


Figure 5.3. Unimolecular and bimolecular reactions along with their low-level DNA implementations. Both reactions are two-step processes in which the input releases an intermediate strand, and the intermediate strand releases an output upon combining with an auxiliary dsDNA complex. The figure is adapted from Ref. [1] with permission.

While it might seem hard to imagine that this two-step process approximates the unimolecular reaction, it can be proved, under certain assumptions, that these reactions indeed do so; the reader is encouraged to refer to the full paper for details.1

In unimolecular reactions, only one input is required. However, most systems use more than one input species, and bimolecular reactions therefore form the simplest non-trivial reaction set. To implement such a system, A + B −→ C with rate k, we simply need a dsDNA complex and one of the inputs. The input strand can combine with the gate complex to produce the intermediate species I, which then, in turn, produces the output C with the help of an auxiliary complex. This implementation ensures that the stoichiometry of both inputs is maintained and that the intermediate is produced only in the presence of both inputs (refer to Figure 5.3).
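To see the quality of the approximation in silico, one can integrate the mass-action ODEs for the ideal reaction and for its two-step emulation side by side. In the sketch below (Python/SciPy), the rate constants and gate concentrations are illustrative placeholders rather than values from Ref. [1]; with the gates in large excess and the second step made fast, the cascade tracks the ideal A −→ B closely.

    # Mass-action comparison: ideal A -> B vs. its two-step PSD emulation.
    from scipy.integrate import solve_ivp

    k = 0.01              # target unimolecular rate, 1/s (illustrative)
    G0 = 100.0            # gate concentrations, nM (large excess)
    q1 = k / G0           # A + G1 -> I: effective rate q1 * G0 matches k
    q2 = 10 * k / G0      # I + G2 -> B: faster, so I stays small

    def ideal(t, y):
        A, B = y
        return [-k * A, k * A]

    def two_step(t, y):
        A, G1, I, G2, B = y
        r1 = q1 * A * G1   # A + G1 -> I
        r2 = q2 * I * G2   # I + G2 -> B
        return [-r1, -r1, r1 - r2, -r2, r2]

    s1 = solve_ivp(ideal, (0, 600), [10.0, 0.0], dense_output=True)
    s2 = solve_ivp(two_step, (0, 600), [10.0, G0, 0.0, G0, 0.0],
                   dense_output=True)
    for t in (60, 300, 600):
        print(f"t={t:3d}s  ideal B={s1.sol(t)[1]:5.2f}"
              f"  two-step B={s2.sol(t)[4]:5.2f}")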


Similar to the unimolecular case, it is possible to prove that this two-step process, along with a supporting linker reaction, can approximate the bimolecular reaction. However, this is beyond the scope of this chapter, and the reader is encouraged to read the original paper.1

Using these simple systems that implement unimolecular and bimolecular reactions, very complex and interesting phenomena can be realized. For example, these systems can be used for population protocols such as molecular consensus. In such a network, as shown in Figure 5.4(a), two molecular species interact with each other to reach a consensus if a majority is found. The in silico demonstration of a molecular consensus network is shown in Figure 5.4(b). Depending on the relative amounts of species A and B, the network, which is a set of CRNs, can compute the majority. Upon computation, all the minority species are consumed and converted to the majority. A more complex system that has also been demonstrated is a rock-paper-scissors oscillator. Such a molecular protocol is considered complex since it requires careful calibration of reaction rates and stoichiometry of species. In this cyclic system, rock beats scissors, scissors beat paper, and paper beats rock. Therefore, such a network, under certain initial conditions, can keep oscillating indefinitely. The DNA implementation of such a system closely approximates the ideal oscillatory behavior; however, it eventually starts diverging after the auxiliary gates are consumed (refer to Figure 5.4(d)). These example networks demonstrate the power of CRNs and their DNA implementations. Of course, an experimental demonstration of these systems using DNA and PSD would be the icing on the cake; however, this remains a work in progress.1
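For intuition, a consensus network can also be simulated stochastically. The sketch below (Python) runs a Gillespie simulation of a textbook approximate-majority CRN (A + B → 2U, A + U → 2A, B + U → 2B; cf. Ref. [5]); the exact reaction set compiled in Ref. [1] may differ, so treat this as a conceptual stand-in.

    # Gillespie simulation of an approximate-majority consensus CRN.
    import random

    def consensus(a, b, u=0, k=1.0, seed=1):
        rng = random.Random(seed)
        t = 0.0
        while a and b:  # run until one opinion is extinct
            # Propensities of: A+B->2U, A+U->2A, B+U->2B.
            props = [k * a * b, k * a * u, k * b * u]
            total = sum(props)
            t += rng.expovariate(total)
            r = rng.uniform(0, total)
            if r < props[0]:
                a, b, u = a - 1, b - 1, u + 2
            elif r < props[0] + props[1]:
                a, u = a + 1, u - 1
            else:
                b, u = b + 1, u - 1
        return ("A" if a else "B"), t

    print(consensus(a=70, b=30))   # the initial majority A usually wins
    print(consensus(a=20, b=80))   # the initial majority B usually wins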

5.5. Methods and Protocol

5.5.1. Oligonucleotide design, synthesis, and purification

It has long been known49 that DNA strand design plays a crucial role in the successful operation of large-scale circuits, and therefore utmost care should be taken while designing the strands. Based on


Figure 5.4. (a) and (b) A consensus network. (c) and (d) A rock-paper-scissors oscillator. Both systems were designed and implemented using the strand-displacing polymerase scheme. The figure is adapted from Ref. [1] with permission.


our prior works, here are some general guidelines that can be used to design DNA strands:

1. The GC content of a DNA strand should be roughly in the range of 40–60%, with no more than three consecutive identical nucleotides (G, C, A, or T); longer runs are called homopolymers (e.g., CCCC) and should be avoided (see the sketch at the end of this subsection for an automated check).

2. The toehold region should always be at the 5′ end (or 3′-exposed region), since the polymerase extends in the 5′ to 3′ direction. The toehold can range from 10 nt to 30 nt depending on the desired reaction rate.

3. Polymerase activity varies with temperature, and therefore a thermal chamber is essential. In the case of Bst polymerase, a temperature of 50–65°C is suitable.

4. Use a 5–8 nt clamping strategy2, 5, 11 on the ends of dsDNA, that is, use GC-rich ends to prevent DNA breathing from opening downstream gates.

It is well known that once the DNA is synthesized, two rounds of purification are required:5, 39 (a) 12% denaturing PAGE, and (b) non-denaturing PAGE. This greatly mitigates unwanted reactions, also called leaky50 reactions, and removes synthesis errors. The voltage and time for running the gels depend on the size of the gel and the length of the DNA; however, 200 V for 30 min is a good starting point for the denaturing gel, and 150 V for 4 h for the non-denaturing gel. For gate preparation, several protocols suggesting different annealing times are available; however, a simple 2 h annealing cycle with a linear temperature gradient from 95°C to 25°C works well.
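As promised in guideline 1 above, here is a minimal sequence sanity check (Python; the thresholds mirror the guideline, and the function is our own illustrative helper, not part of any published design pipeline):

    # Check GC content (40-60%) and forbid homopolymer runs longer than 3 nt.
    import re

    def check_strand(seq: str) -> list:
        seq = seq.upper()
        issues = []
        gc = (seq.count("G") + seq.count("C")) / len(seq)
        if not 0.40 <= gc <= 0.60:
            issues.append(f"GC content {gc:.0%} outside 40-60%")
        if re.search(r"(A{4,}|C{4,}|G{4,}|T{4,})", seq):
            issues.append("homopolymer run of 4+ identical nucleotides")
        return issues

    print(check_strand("ATGCCCCATGCATGCATGCATGCATGCATG"))  # flags CCCC run
    print(check_strand("ATGCAGTCAGTCAGTCAGTCAGTCAGTCAG"))  # -> [] (passes)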


5.5.2. Fluorescence sample preparation and measurement

DNA circuits are generally tested using a fluorescence spectrometer.2, 5, 7, 9, 11, 12, 19 Some care should be taken while observing the reaction output, since polymerase activity is highly correlated with temperature. Prepare the reaction master mix with all the required buffers and DNA strands; this should include dNTPs, polymerase, isothermal amplification buffer, and dsDNA gates, as well as fluorescent DNA reporters. Incubate this solution at the desired temperature in the PCR machine or cuvette for 20 min. Add the input strand quickly to the sample and start measuring the reaction. It should be ensured that the volume of input is minimal compared to the rest of the sample volume, since the input is added at room temperature. For example, 1 µL of input (about 1% of the total volume) can be added to 90 µL of the master mix. This ensures minimal temperature fluctuation upon adding the input.

5.6. Discussion and Outlook

In this chapter, we introduced an alternative architectural unit for DNA computing, namely, polymerase strand displacement, which uses a polymerase enzyme as the energy source for strand displacement. We briefly compared it with traditional toehold-mediated strand displacement, followed by a short review of state-of-the-art techniques for DNA computing and DNA-based CRNs using a strand-displacing polymerase. As with other systems, the biggest challenge in building large-scale polymerase systems is reaction leak, and therefore leakless design principles48 need to be adapted to polymerase-based designs; this largely remains an open problem. Nevertheless, these new studies1, 2 promise an exciting future for DNA computing, since they demonstrate that there are still new research avenues to be explored in engineering large-scale DNA computing systems.

References

1. S. Shah, T. Song, X. Song, M. Yang, and J. Reif, Implementing arbitrary CRNs using strand displacing polymerase. In International Conference on DNA Computing and Molecular Programming (Springer, 2019), pp. 21–36.
2. T. Song, A. Eshra, S. Shah, H. Bui, D. Fu, M. Yang, R. Mokhtar, and J. Reif, Fast and compact DNA logic circuits based on single-stranded gates using strand-displacing polymerase. Nat. Nanotechnol. 14(11), 1075–1081 (2019).
3. L. M. Adleman, Molecular computation of solutions to combinatorial problems. Science 266(5187), 1021–1024 (1994).
4. L. Cardelli, Two-domain DNA strand displacement. Math. Struct. Comput. Sci. 23(2), 247–271 (2013).
5. Y. J. Chen, N. Dalchau, N. Srinivas, A. Phillips, L. Cardelli, D. Soloveichik, and G. Seelig, Programmable chemical controllers made from DNA. Nat. Nanotechnol. 8(10), 755–762 (2013).


6. K. M. Cherry and L. Qian, Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks. Nature 559(7714), 370 (2018).
7. A. Eshra, S. Shah, T. Song, and J. Reif, Renewable DNA hairpin-based logic circuits. IEEE Trans. Nanotechnol. (2019).
8. T. Fujii and Y. Rondelez, Predator–prey molecular ecosystems. ACS Nano 7(1), 27–34 (2012).
9. S. Garg, S. Shah, H. Bui, T. Song, R. Mokhtar, and J. Reif, Renewable time-responsive DNA circuits. Small 14(33), 1801470 (2018).
10. W. Li, Y. Yang, H. Yan, and Y. Liu, Three-input majority logic gate and multiple input logic circuit based on DNA strand displacement. Nano Lett. 13(6), 2980–2988 (2013).
11. L. Qian and E. Winfree, Scaling up digital circuit computation with DNA strand displacement cascades. Science 332(6034), 1196–1201 (2011).
12. G. Seelig, D. Soloveichik, D. Y. Zhang, and E. Winfree, Enzyme-free nucleic acid logic circuits. Science 314(5805), 1585–1588 (2006).
13. D. Soloveichik, G. Seelig, and E. Winfree, DNA as a universal substrate for chemical kinetics. Proc. Nat. Acad. Sci. 107(12), 5393–5398 (2010).
14. X. Song, A. Eshra, C. Dwyer, and J. Reif, Renewable DNA seesaw logic circuits enabled by photoregulation of toehold-mediated strand displacement. RSC Adv. 7(45), 28130–28144 (2017).
15. H. Bui, S. Shah, R. Mokhtar, T. Song, S. Garg, and J. Reif, Localized DNA hybridization chain reactions on DNA origami. ACS Nano 12(2), 1146–1155 (2018).
16. J. Chao, J. Wang, F. Wang, X. Ouyang, E. Kopperger, H. Liu, Q. Li, J. Shi, L. Wang, J. Hu, et al., Solving mazes with single-molecule DNA navigators. Nat. Mater. 18(3), 273 (2019).
17. J. H. Reif and S. Sahu, Autonomous programmable DNA nanorobotic devices using DNAzymes. Theoret. Comput. Sci. 410(15), 1428–1439 (2009).
18. M. Teichmann, E. Kopperger, and F. C. Simmel, Robustness of localized DNA strand displacement cascades. ACS Nano 8(8), 8487–8496 (2014).
19. A. J. Thubagere, W. Li, R. F. Johnson, Z. Chen, S. Doroudi, Y. L. Lee, G. Izatt, S. Wittman, N. Srinivas, D. Woods, E. Winfree, and L. Qian, A cargo-sorting DNA robot. Science 357(6356) (2017).
20. A. Goyal, D. Limbachiya, S. K. Gupta, F. Joshi, S. Pritmani, A. Sahai, and M. K. Gupta, DNA pen: A tool for drawing on a molecular canvas (2013), arXiv preprint arXiv:1306.0369.
21. M. R. Lakin, S. Youssef, F. Polo, S. Emmott, and A. Phillips, Visual DSD: A design and analysis tool for DNA strand displacement systems. Bioinformatics 27(22), 3211–3213 (2011).
22. H. W. van Roekel, L. H. Meijer, S. Masroor, Z. C. Felix-Garza, A. Estevez-Torres, Y. Rondelez, A. Zagaris, M. A. Peletier, P. A. Hilbers, and T. F. de Greef, Automated design of programmable enzyme-driven DNA circuits. ACS Synth. Biol. 4(6), 735–745 (2014).
23. S. Shah, A. Dubey, and J. Reif, Programming temporal DNA barcodes for single-molecule fingerprinting. Nano Lett. (2019).


24. S. Shah, D. Limbachiya, and M. K. Gupta, DNACloud: A potential tool for storing big data on DNA (2013), arXiv preprint arXiv:1310.6992.
25. A. J. Thubagere, C. Thachuk, J. Berleant, R. F. Johnson, D. A. Ardelean, K. M. Cherry, and L. Qian, Compiler-aided systematic construction of large-scale DNA strand displacement circuits using unpurified components. Nat. Commun. 8, 14373 (2017).
26. J. N. Zadeh, C. D. Steenberg, J. S. Bois, B. R. Wolfe, M. B. Pierce, A. R. Khan, R. M. Dirks, and N. A. Pierce, NUPACK: Analysis and design of nucleic acid systems. J. Comput. Chem. 32(1), 170–173 (2011).
27. Y. Ke, L. L. Ong, W. M. Shih, and P. Yin, Three-dimensional structures self-assembled from DNA bricks. Science 338(6111), 1177–1183 (2012).
28. L. L. Ong, N. Hanikel, O. K. Yaghi, C. Grun, M. T. Strauss, P. Bron, J. Lai-Kee-Him, F. Schueder, B. Wang, P. Wang, et al., Programmable self-assembly of three-dimensional nanostructures from 10,000 unique components. Nature 552(7683), 72 (2017).
29. G. Tikhomirov, P. Petersen, and L. Qian, Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns. Nature 552(7683), 67 (2017).
30. B. Wei, M. Dai, and P. Yin, Complex shapes self-assembled from single-stranded DNA tiles. Nature 485(7400), 623 (2012).
31. R. Jungmann, M. S. Avendaño, M. Dai, J. B. Woehrstein, S. S. Agasti, Z. Feiger, A. Rodal, and P. Yin, Quantitative super-resolution imaging with qPAINT. Nat. Meth. 13(5), 439 (2016).
32. R. Jungmann, M. S. Avendaño, J. B. Woehrstein, M. Dai, W. M. Shih, and P. Yin, Multiplexed 3D cellular super-resolution imaging with DNA-PAINT and Exchange-PAINT. Nat. Meth. 11(3), 313 (2014).
33. C. Lin, R. Jungmann, A. M. Leifer, C. Li, D. Levner, G. M. Church, W. M. Shih, and P. Yin, Submicrometre geometrically encoded fluorescent barcodes self-assembled from DNA. Nat. Chem. 4(10), 832–839 (2012).
34. S. Shah, A. K. Dubey, and J. Reif, Improved optical multiplexing with temporal DNA barcodes. ACS Synth. Biol. 8(5), 1100–1111 (2019).
35. S. Shah and J. Reif, Temporal DNA barcodes: A time-based approach for single-molecule imaging. In International Conference on DNA Computing and Molecular Programming (Springer, 2018), pp. 71–86.
36. Y. Erlich and D. Zielinski, DNA fountain enables a robust and efficient storage architecture. Science 355(6328), 950–954 (2017).
37. N. Goldman, P. Bertone, S. Chen, C. Dessimoz, E. M. LeProust, B. Sipos, and E. Birney, Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature 494(7435), 77 (2013).
38. X. Song and J. Reif, Nucleic acid databases and molecular-scale computing. ACS Nano 13(6), 6256–6268 (2019). DOI: 10.1021/acsnano.9b02562.
39. N. Srinivas, J. Parkin, G. Seelig, E. Winfree, and D. Soloveichik, Enzyme-free nucleic acid dynamical systems. Science 358(6369), eaal2052 (2017).
40. A. Eshra and A. El-Sayed, An odd parity checker prototype using DNAzyme finite state machine. IEEE/ACM Trans. Comput. Biol. Bioinform. 11(2), 316–324 (2013).


41. S. Garg, H. Bui, A. Eshra, S. Shah, and J. Reif, Nucleic acid hairpins: A robust and powerful motif for molecular devices. In Y. Zhang and B. Xu (eds.), Soft Nanomaterials (World Scientific, 2019), pp. 175–199.
42. H. L. Chen, D. Doty, and D. Soloveichik, Deterministic function computation with chemical reaction networks. Nat. Comput. 13(4), 517–534 (2014).
43. M. Cook, D. Soloveichik, E. Winfree, and J. Bruck, Programmability of chemical reaction networks. In A. Condon, D. Harel, J. N. Kok, A. Salomaa, and E. Winfree (eds.), Algorithmic Bioprocesses (Springer, 2009), pp. 543–584.
44. L. Qian, D. Soloveichik, and E. Winfree, Efficient Turing-universal computation with DNA polymers. In International Workshop on DNA-Based Computers (Springer, 2010), pp. 123–140.
45. D. Fu, S. Shah, T. Song, and J. Reif, DNA-based analog computing. In Synthetic Biology (Springer, 2018), pp. 411–417.
46. R. R. Machinek, T. E. Ouldridge, N. E. Haley, J. Bath, and A. J. Turberfield, Programmable energy landscapes for kinetic control of DNA strand displacement. Nat. Commun. 5, 5324 (2014).
47. N. Srinivas, T. E. Ouldridge, P. Šulc, J. M. Schaeffer, B. Yurke, A. A. Louis, J. P. Doye, and E. Winfree, On the biophysics and kinetics of toehold-mediated DNA strand displacement. Nucleic Acids Res. 41(22), 10641–10658 (2013).
48. B. Wang, C. Thachuk, A. D. Ellington, E. Winfree, and D. Soloveichik, Effective design principles for leakless strand displacement systems. Proc. Nat. Acad. Sci. 115(52), E12182–E12191 (2018).
49. M. Arita, A. Nishikawa, M. Hagiya, K. Komiya, H. Gouzu, and K. Sakamoto, Improving sequence design for DNA computing. In Proceedings of the 2nd Annual Conference on Genetic and Evolutionary Computation (Morgan Kaufmann Publishers Inc., 2000), pp. 875–882.
50. T. Song, N. Gopalkrishnan, A. Eshra, S. Garg, R. Mokhtar, H. Bui, H. Chandran, and J. Reif, Improving the performance of DNA strand displacement circuits by shadow cancellation. ACS Nano 12(11), 11689–11697 (2018).


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0006

Chapter 6

Optics-Free Imaging with DNA Microscopy: An Overview Xin Song∗,†,‡ and John Reif∗,‡ ∗

Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA † Department of Biomedical Engineering, Duke University, Durham, NC 27708, USA ‡ Department of Computer Science, Duke University, Durham, NC 27708, USA

6.1. Introduction

The specificity and programmability of Watson–Crick base pairing enable scientists to design networks of synthetic DNA oligonucleotides (oligos) that react in predictable ways to realize systems of DNA-based computation,1 data storage,2 and molecular imaging.3 The ability to resolve details at small scales is crucial to the development of many fields such as the biological sciences and materials engineering. Conventional microscopy relies on photons (e.g., in optical microscopy), electrons (e.g., in electron microscopy), or scanning probes (e.g., in atomic force microscopy) to directly interrogate the spatial arrangement of surface or volumetric features of a given sample. However, these techniques have various shortcomings, such as the diffraction-limited resolution of optical imaging, the expensive instrumentation of electron beam imaging, and the low throughput of atomic force imaging. Over the past few decades, DNA nanotechnology has contributed a wealth of techniques and tools, including super-resolution imaging, which can leverage the stochastic binding


kinetics of short dye-labeled DNA probes to visualize fine features on molecular-scale targets such as DNA origami,4 improving imaging multiplexity and localization precision. Recently, the concept of optics-free imaging5 has offered an alternative route that uses DNA as a molecular-scale imaging medium without a priori spatial addressing and leverages next-generation sequencing for high-throughput data readout. This new modality of DNA-based microscopy completely obviates the use of optics for molecular image acquisition and reconstruction and may offer a scalable and economical solution to prior limitations faced by conventional microscopy.

This chapter provides a short survey of recent achievements in both theoretical and experimental implementations of DNA-based optics-free imaging, which we collectively refer to as DNA microscopy. As shown in Figure 6.1, the typical pipeline of DNA microscopy involves a few steps: (i) DNA barcoding of molecular targets, (ii) pairwise concatenation of adjacent barcodes, (iii) sequencing of the concatenated barcodes, and (iv) image reconstruction based on barcode proximity information. Unlike conventional imaging systems that generate direct visualizations of a sample in the microscope's field of view, DNA microscopy investigates the local proximities between individual targets of interest to map out the relative positions among all the targets (e.g., a population of molecules). This set of pairwise adjacencies can be algorithmically analyzed to reconstruct a global image that approximates the original


Figure 6.1. Typical pipeline of optics-free DNA microscopy. (a) Original spatial arrangement of molecular targets. (b) Tagging the targets of interest with unique DNA barcodes. (c) Pairwise concatenation of adjacent barcodes. (d) Recovery of concatemers by DNA amplification and sequencing. (e) Reconstruction of a spatial image based on pairwise proximities. The estimated image may not preserve the original scale, rotation, and chirality of the sample without additional information.


positions of the molecular targets on a 2D surface or in a 3D volume without any a priori knowledge of their spatial localities.

6.2. DNA Microscopy for Surface 2D Imaging

For DNA microscopy on flat surfaces (including applications that investigate the spatial arrangement of 3D objects through their projections onto a 2D surface), the process of image reconstruction is analogous to forming a 2D graph, where each node corresponds to an individual target on the sample and each edge connecting two nodes reflects the juxtaposition of a pair of neighboring targets. In other words, the nodes of the graph can be viewed as the pixels of an underlying sample image. The first step in forming such a graph is to confer each node with a unique identity. Typically, DNA microscopy achieves this by tagging each target (e.g., molecule) on the sample with a unique DNA barcode (i.e., a short random sequence of nucleotides). Next, barcodes of adjacent targets undergo enzymatic reactions to record the pairwise proximity between each two neighboring nodes. In large and complex graphs, each node may be in close proximity to multiple other nodes. Therefore, practical DNA microscopy based on the recovery of local proximities needs to account for this potential "one-to-many" relationship between any given node and its immediately adjacent neighbors. Once an adequate set of pairwise proximities is recovered (typically by DNA amplification and sequencing), the edges of the graph can be reconstructed; collectively, they determine the relative positions of the nodes and form a graph that approximates the original spatial distribution of barcoded targets on the surface. Because only local proximity information is used to reconstruct the graph, the resulting global image may not preserve the original scale, rotation, and chirality of the actual sample image without additional information. However, the use of DNA barcodes and massively parallel sequencing gives DNA microscopy a unique advantage in terms of achieving remarkably high multiplexity and throughput for investigating complex and large populations of molecular targets.5
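The graph-based reconstruction idea can be prototyped in a few lines. The sketch below (Python with NetworkX; the synthetic data and k-nearest-neighbor "proximity records" are our own toy assumptions) recovers an embedding from adjacency alone using a force-directed (spring) layout; as noted above, the result is only defined up to scale, rotation, and reflection.

    # Toy DNA-microscopy reconstruction: proximity graph -> spring layout.
    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(0)
    pts = rng.random((60, 2))              # hidden "true" target positions

    G = nx.Graph()
    k = 5                                  # record the k nearest neighbors
    for i, p in enumerate(pts):
        d = np.linalg.norm(pts - p, axis=1)
        for j in np.argsort(d)[1:k + 1]:   # skip self (distance 0)
            G.add_edge(i, int(j))          # one "pairwise proximity record"

    # Force-directed embedding from adjacency only (no coordinates used).
    est = nx.spring_layout(G, seed=0)
    print(f"nodes={G.number_of_nodes()}  edges={G.number_of_edges()}")
    print("estimated position of node 0:", est[0])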


Experimentally, the proximity record between two adjacent DNA barcodes may be generated via biochemical colocalization techniques such as DNA proximity ligation6 (facilitated by a connector oligo that binds to two adjacent probes and joins them to form a template for ligation and subsequent PCR amplification) or DNA proximity extension7 (facilitated by hybridization between two adjacent probes with matching annealing sites, followed by polymerase extension of the hybridizing oligos to form a template for PCR amplification). However, the barcoded probes used in these techniques usually become depleted after recording a single proximity pair and thus do not work well in scenarios where a given node has more than one neighbor in immediate proximity. To address this challenge, Schaus et al.8 proposed an "auto-cycling proximity recording" mechanism (Figure 6.2(a)) that enables continuous and repeated generation of proximity records between any adjacent pair of DNA-barcoded molecular targets. In their design, each target is tagged with a DNA hairpin probe that has a unique barcode and an open toehold (i.e., a single-stranded overhang). A DNA hairpin is a secondary structure formed when single-stranded DNA folds on itself, producing an unpaired loop with a duplex stem region. In this case, the toehold forms a sticky end that can hybridize to a primer. With a strand-displacing polymerase and the energy from dNTPs, the double-stranded hairpin stem can be primed partially open until the polymerase encounters a stopper before the hairpin loop. The hairpin sequence is carefully designed to bias toward the reformation of the closed hairpin; this strategy facilitates the newly synthesized strand (referred to as a "half-record") being isothermally displaced by DNA strand displacement (a hybridization-based process in which an incumbent strand is displaced by an invading strand via branch migration), which spontaneously reforms the hairpin stem. This single-stranded half-record carries the identity of the hairpin and exposes a palindromic domain that can hybridize to the palindromic domain of a nearby half-record produced by an adjacent probe. The hybridized palindromic domains are then extended by polymerase, resulting in the release of a double-stranded "full-record" that incorporates both barcodes from the two adjacent probes. The release of the full-record in turn regenerates the involved pair of probes to their initial states so that both of them can continue


the cycles of transient binding with any adjacent probes. Such an isothermal "copy and release" mechanism enables a probe to produce multiple proximity records with any immediately adjacent probes in a nondestructive, autonomous, and continuous manner. The authors compared different probe lengths and attachment chemistries on DNA origami to characterize the generation rate of proximity records, which was found to be nonlinearly related to the probe–probe distance. In addition to experimentally interrogating complex nanoscale geometries, such as a seven-probe geometry on DNA origami, the authors demonstrated their nanoscope's reusability by repeatedly sampling the same probe undergoing different state changes. The work reports that each probe can generate ∼30 records on average, offering high sensitivity for potentially recording the complete set of local proximities needed for intact global image reconstruction.8

To allow the generation of more than one proximity record per probe, Boulgakov et al.9 extended the conventional technique of proximity ligation to implement iterative cycles of ligation, cleavage, and re-ligation between adjacent pairs of DNA probes (Figure 6.2(b)). Each probe contains a primer site, a unique barcode sequence, and a half restriction site.


Figure 6.2. Mechanisms for generating multiple proximity records between adjacent barcoded molecules. (a) Auto-cycling proximity recording (adapted from Ref. [8] with permission). (b) Iterative reversible proximity ligation (adapted from Ref. [9] with permission).


In the presence of a short bridging oligo complementary to the restriction half-sites, two adjacent probes can be ligated to form a concatenated record, which is subsequently duplicated by primer extension and read out by real-time polymerase chain reaction (qPCR). The sequences of the probes are designed such that a restriction cut can be made at the ligation site to revert the probes to their original unligated state. This enables each probe to participate in further cycles of reversible ligation, recording its adjacency to different probes located in close proximity. The overall technique is conceptually simple; however, it requires that two neighboring DNA probes have opposite polarities in order to form a pairwise concatenated record. Such a criterion may limit the achievable robustness and resolution of image acquisition and reconstruction. As a proof of concept, the authors tested the feasibility of reversible ligations using probes attached to magnetic beads. Although the technique was not experimentally demonstrated on complex spatial colocalizations involving multiple targets, this work proposed the application of graph theory to interpret the outputs of iterative proximity ligations and subsequently recover complex 2D geometries using spring layout algorithms. According to simulations, the algorithms were able to reconstruct various 2D topologies with relatively small errors despite just a few rounds of iterative ligations and low ligation efficiencies.9

Recently, Hoffecker et al.10 developed a computational framework that also explores the use of graph theory to facilitate optics-free image reconstruction for DNA microscopy. Their idea is to convert the surface of interest into a tessellation of barcoded patches and then form an untethered graph by probing the pairwise adjacencies between patches with shared borders. The tessellated surface offers several characteristic properties that can be leveraged mathematically to reconstruct planar embeddings of the untethered graph. Different algorithms may be used to compute a proper embedding that approximates the original Euclidean spatial information of the underlying image. To form such a tessellated surface, the authors propose to randomly seed barcoded DNA probes on a surface covered with primers; each seeded probe is then locally amplified to form a colony of probes bearing the same barcode as the


seed probe. As the colonies expand and saturate (i.e., when adjacent colonies share boundaries), a Voronoi tessellation is formed with each patch uniquely barcoded. After that, the pairwise proximities between immediately adjacent colonies can be biochemically recorded and read out by sequencing to facilitate subsequent graph recovery and image reconstruction. According to simulations, higher colony density (i.e., average number of colonies per unit area) reduces the average distortion and improves the resolution of image reconstruction, whereas variation in site density (i.e., average number of duplicated probes within a colony) has a minimal effect on distortion. It is worth noting that each colony ultimately represents a single pixel in the estimated image; as a result, self-pairing events within individual colonies do not provide additional information for graph reconstruction and may be avoided via techniques such as bipartite networks or series colony generation.10
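The tessellation idea is easy to emulate numerically. The sketch below (Python/SciPy; random seed points stand in for barcoded colonies) builds a Voronoi diagram and extracts the patch-adjacency graph that, in the real system, would be recovered by sequencing the pairwise border records.

    # Voronoi tessellation of random "colony" seeds -> patch adjacency graph.
    import numpy as np
    from scipy.spatial import Voronoi

    rng = np.random.default_rng(3)
    seeds = rng.random((40, 2))       # seed positions of barcoded colonies
    vor = Voronoi(seeds)

    # Each Voronoi ridge separates exactly two seed points, i.e., two
    # colonies sharing a border; these pairs are the recordable adjacencies.
    adjacency = {(int(i), int(j)) for i, j in vor.ridge_points}
    print(f"{len(seeds)} colonies, {len(adjacency)} adjacent pairs")
    print("colonies adjacent to colony 0:",
          sorted({j for i, j in adjacency if i == 0} |
                 {i for i, j in adjacency if j == 0}))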

6.3. DNA Microscopy for Volumetric 3D Imaging

Because image reconstruction in DNA microscopy depends primarily on investigating local proximities, the technique can be applied to reconstruct a sample's spatial information post hoc even if the sample's original spatial arrangement is scrambled or lost. Glaser et al.11 view the basic principle behind DNA microscopy as analogous to putting together a large puzzle based only on the relative positions of adjacent puzzle pieces (Figure 6.3(a)). Specifically, the set of relative positions can be collectively represented by an N × N similarity matrix for a puzzle containing N pieces, where each entry of the matrix describes the proximity between two given puzzle pieces. The goal of image reconstruction can then be viewed as mapping such a high-dimensional matrix to a lower-dimensional image that reflects the puzzle pieces' original spatial coordinates (typically in 2D or 3D). To facilitate high-resolution image reconstruction for large-scale "puzzle imaging", the authors developed two dimensionality reduction algorithms, sparse diffusion maps and unweighted landmark isomap, and evaluated the algorithms' performance in three hypothetical applications: (i) neural


Figure 6.3. Example implementations and applications of DNA microscopy. (a) Puzzle voxel imaging for 3D mapping of neurons in a tissue sample (adapted from Ref. [11] with permission). (b) Spatio-genetic imaging of transcripts for de novo reconstruction of a multicellular ensemble (adapted from Ref. [12] with permission, Copyright 2019, Elsevier).

voxel puzzling, (ii) neural connectomics puzzling, and (iii) chemical puzzling. The first application explores 3D mapping of neurons in a tissue sample by labeling each neuron with a unique barcode and then shattering the tissue sample into numerous tiny voxels. Each voxel may contain several barcoded neurons whose identities can be inferred by sequencing. Based on the fact that adjacent voxels are likely to share more common neurons, a similarity matrix can be constructed and analyzed to recover the relative coordinates of each voxel's placement in the original 3D space. On simulated data, both algorithms reconstructed the image faithfully, with impressive robustness against missing voxels. In the second


application, information on neural connections is used to recover the spatial locations of individual neurons within a 3D specimen. First, neurons forming synaptic connections link their DNA barcodes to form pairwise proximity records. Next, the algorithm takes these proximity records as input to generate a connectivity matrix, which can be treated as the similarity matrix used to puzzle the individual neurons back into their respective positions in the original 3D space. For this particular application, it is worth noting that neurons may form both short-range and long-range connections; therefore, algorithms such as the unweighted landmark isomap, which expects only short-range connections, may lead to incorrect image reconstructions. In the third application, puzzle imaging was used to map out complex gradients of chemical concentrations by recovering the locations of chemical-sensing bacterial cells growing on a surface. Recovery of the spatial localities relies on recording pairwise adjacencies between neighboring bacterial colonies via conjugative transfer of plasmids between cells. Overall, this work provides thorough evaluations of the proposed algorithms and offers important insights into the computational complexity and image reconstruction accuracy of DNA microscopy in different use cases, which may generalize to various other applications.

Weinstein and coauthors12 recently developed an experimental platform for spatio-genetic DNA microscopy based on amplicon diffusion dynamics (Figure 6.3(b)). In their system, transcript molecules within a biological specimen are tagged with unique DNA barcodes, and each barcoded molecule is then amplified in situ to form a molecular diffusion cloud. As the PCR products slowly diffuse across the cell, adjacent diffusion clouds begin to encounter and overlap each other to various degrees. During this process, amplicons located within the overlapping clouds can concatenate with one another via overlap-extension PCR reactions. The pairwise concatenation reaction is designed to disallow self-reactions among amplicons that have identical barcodes. As a result, each pairwise concatemer records the identities of the two adjoining amplicons, the biological sequences of the two associated transcripts, as well as a unique identifier generated for the single concatenation event. Because each concatemer is
uniquely labeled by an event identifier, one can count the total number of concatenation events that have occurred between two overlapping clouds. From a probabilistic point of view, this number gives an estimate of the degree of cloud overlap and in turn reflects the original spatial proximity between the corresponding barcoded transcripts from which the clouds were initially formed. By arranging the counts of concatenations between different pairs of diffusion clouds in a matrix form, one can apply matrix algebra and spectral graph theory to decode the information into a high-resolution image that visualizes the original dimensionality and the spatial arrangement of target molecules in the specimen. Using this protocol, the authors experimentally demonstrated large-scale optics-free imaging of transcripts in mixed cell populations and achieved remarkable cellular resolution without systematic distortions.
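This matrix-based decoding step can be sketched numerically. The following toy example is our own illustration, not the authors' published pipeline: it treats a symmetric matrix C of pairwise concatenation counts as a weighted proximity graph and recovers relative coordinates from the low eigenvectors of its graph Laplacian; the function name reconstruct_positions and all parameter values are hypothetical.

```python
import numpy as np

def reconstruct_positions(C, dim=2):
    """Recover relative coordinates from a symmetric matrix C of pairwise
    concatenation counts (a toy Laplacian-eigenmap step): C[i, j] is
    assumed to grow with the spatial proximity of molecules i and j."""
    L = np.diag(C.sum(axis=1)) - C          # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]               # skip the trivial constant mode

# Toy data: molecules on a ring, counts decaying with distance plus noise
rng = np.random.default_rng(0)
theta = np.sort(rng.uniform(0.0, 2.0 * np.pi, 60))
pts = np.c_[np.cos(theta), np.sin(theta)]
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
C = rng.poisson(50.0 * np.exp(-(dist / 0.3) ** 2)).astype(float)
C = np.triu(C, 1); C = C + C.T              # symmetric, zero diagonal
coords = reconstruct_positions(C)           # recovered up to rotation/scale
```

Note that only the relative geometry is recoverable: any rotation, reflection, or rescaling of the output is an equally valid reconstruction.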

6.4. Conclusions and Outlook

DNA microscopy enables an exciting new form of molecular imaging that completely sidesteps the use of optics, from initial image acquisition to final image reconstruction and visualization. Taking advantage of DNA barcoding and next-generation sequencing, recent works have demonstrated the technology's remarkable potential for achieving high resolution, multiplexing, and scalability without the specialized equipment required by conventional imaging platforms. As a nascent technology, DNA microscopy still faces various limits and challenges before it can be standardized and widely adopted. In particular, the reliance on local proximities to estimate global patterns makes it difficult to reliably recover images that have empty spaces or disjoint features. To address challenges like this, improvements may be needed in both the biochemistries (e.g., enhancing the efficiency and yield of proximity recording) and the image reconstruction algorithms (e.g., reducing the computational complexity and increasing the robustness against noisy and missing data). With continuing development and refinement, DNA microscopy may become a leading implementation for optics-free molecular imaging that is both robust and cost-effective.

Acknowledgments

This work was sponsored by NSF grants no. CCF-1813805 and CCF-1909848.

References

1. F. Wang, H. Lv, Q. Li, J. Li, X. Zhang, J. Shi, L. Wang, and C. Fan, Implementing digital computing with DNA-based switching circuits. Nat. Commun. 11, 121 (2020).
2. X. Song and J. Reif, Nucleic acid databases and molecular-scale computing. ACS Nano 13, 6256–6268 (2019).
3. R. Jungmann, C. Steinhauer, M. Scheible, A. Kuzyk, P. Tinnefeld, and F. C. Simmel, Single-molecule kinetics and super-resolution microscopy by fluorescence imaging of transient binding on DNA origami. Nano Lett. 10, 4756–4761 (2010).
4. J. Schnitzbauer, M. T. Strauss, T. Schlichthaerle, F. Schueder, and R. Jungmann, Super-resolution microscopy with DNA-PAINT. Nat. Protoc. 12, 1198–1228 (2017).
5. A. A. Boulgakov, A. D. Ellington, and E. M. Marcotte, Bringing microscopy-by-sequencing into view. Trends Biotechnol. 38, 154–162 (2020).
6. I. Weibrecht, K.-J. Leuchowius, C.-M. Clausson, T. Conze, M. Jarvius, W. M. Howell, M. Kamali-Moghaddam, and O. Söderberg, Proximity ligation assays: A recent addition to the proteomics toolbox. Expert Rev. Proteomics 7, 401–409 (2010).
7. M. Lundberg, A. Eriksson, B. Tran, E. Assarsson, and S. Fredriksson, Homogeneous antibody-based proximity extension assays provide sensitive and specific detection of low-abundant proteins in human blood. Nucleic Acids Res. 39, e102 (2011).
8. T. E. Schaus, S. Woo, F. Xuan, X. Chen, and P. Yin, A DNA nanoscope via auto-cycling proximity recording. Nat. Commun. 8, 696 (2017).
9. A. A. Boulgakov, E. Xiong, S. Bhadra, A. D. Ellington, and E. M. Marcotte, From space to sequence and back again: Iterative DNA proximity ligation and its applications to DNA-based imaging. bioRxiv (2018), 470211.
10. I. T. Hoffecker, Y. Yang, G. Bernardinelli, P. Orponen, and B. Högberg, A computational framework for DNA sequencing microscopy. Proc. Natl. Acad. Sci. 116, 19282–19287 (2019).
11. J. I. Glaser, B. M. Zamft, G. M. Church, and K. P. Kording, Puzzle imaging: Using large-scale dimensionality reduction algorithms for localization. PLoS One 10, e0131593 (2015).
12. J. A. Weinstein, A. Regev, and F. Zhang, DNA microscopy: Optics-free spatio-genetic imaging by a stand-alone chemical reaction. Cell 178, 229–241.e16 (2019).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0007

Chapter 7

Fully Analog Memristive Circuits for Optimization Tasks: A Comparison

F. C. Sheldon∗,†, F. Caravelli∗ and C. Coffrin‡

∗Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
†T-Division (T4), Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
‡A-Division (A1), Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA

We introduce a Lyapunov function for the dynamics of memristive circuits and compare the effectiveness of memristors in minimizing this function to widely used optimization software. In particular, we study three classes of problems that can be directly embedded in a circuit topology, and show that memristors effectively (and quickly) extremize these functionals.

7.1. Introduction

As the challenges of scaling traditional transistor-based computational hardware continue to intensify, "Moore's Law," governing the exponential increase of transistor density, is coming to an end. While the first computers were analog,1 in the past decades digital computing has made incredible progress, and our laptops are now more powerful than the supercomputers of just 30 years ago. On the other hand, there remain hard computational problems that still challenge computer scientists and modern digital computers, in particular many optimization problems. Recently, interest has grown
in embedding algorithms directly in analog hardware, in the hope that the corresponding hardware speedup could yield a useful specialized processor. In this chapter we focus on the application of analog nanoscale electronic devices with memory, more specifically memristors. Proposals for specialized co-processors formed of memristors show extreme breadth and versatility in computing applications,1–8 ranging from optimization to artificial neural networks. Here we focus on understanding how the native dynamics of memristive circuits encode features of optimization problems. Memristors are two-terminal devices that display pinched (at the origin) hysteretic behavior in their voltage–current diagram. Physical memristors1,3,5 have rather non-trivial voltage–current curves, but many core features are captured by a simple description which we adopt in this chapter. In this model, the state of the resistance varies between two limiting values and can be described by a parameter w which depends on the previous history of the device dynamics and thus may be interpreted as a memory. We will refer to w as the internal memory parameter. In spirit, memristors have the essential property that the underlying dynamics are the result of a competition between resistance reinforcement, caused by the flow of currents through the device, and a thermodynamically driven decay.9,10 Recent advancements show that there is a deep connection between the asymptotic memory states of such circuits and the solutions of combinatorial optimization problems, such as the ground states of the Ising model and spin glasses. Additionally, memristors offer a possible substrate for constructing neuromorphic chips, i.e., electronic components that behave similarly to human neuronal cells. Central to all of these applications is that memristors, as we show in this chapter, can perform computation without requiring CMOS, thus in a fully analog fashion. As a result, circuits of memristors have been proposed as a potential basis for the next generation of passive and low-energy computational architectures. Interest in specialized analog co-processors for solving optimization problems has generated a host of possible approaches. While some of these problems can in principle be tackled using quantum computers,11,12 it is unlikely that these will be available for mass
distribution. One of the proposed alternative paradigms is in-memory computation:13 removing the separation between memory and computing typical of the von Neumann architecture. In this approach, specialized circuits are designed to utilize active components in concert with memristors to obtain the solution of a specific problem.14,15 In this work, we consider a more fundamental question: do the dynamics of circuits of memristors encode optimization problems natively? Understanding their asymptotic behavior requires characterizing the interplay between nonlinear dynamics, interactions and constraints, and as a result the dynamics of memristor networks is still an area of active research, despite the fact that the theory behind a single device was introduced over half a century ago.2,16 With this purpose in mind, in this chapter we study a specific optimization problem in the context of fully analog memristive circuits, i.e., circuits composed only of memristors. For these circuits we can take advantage of an exact evolution equation for the internal memory parameters, which will serve as our case study. For these equations, we derive a novel Lyapunov function (which also resolves some of the problems of a Lyapunov function previously proposed in the literature). Since the Lyapunov function is minimized by the memristive network, we compare the results of this minimization to state-of-the-art optimization software.

7.2. Dynamical Equation for Memristor Circuits

7.2.1. Single memristor and Lyapunov function

For the case of titanium dioxide devices, a rather simple toy model for the evolution of the resistance is the following:
\[
R(w) = R_{\mathrm{on}}(1 - w) + w R_{\mathrm{off}} \equiv R_{\mathrm{on}}(1 + \xi w), \qquad
\frac{d}{dt} w(t) = \alpha w(t) - \frac{R_{\mathrm{on}}}{\beta}\, i(t), \tag{7.1}
\]
initially studied for α = 0, where 0 ≤ w ≤ 1 and ξ = (R_off − R_on)/R_on; in the equation above, i(t) is the current flowing in the device at time t. Physically, w can be interpreted as the level of internal doping of
the device, but this is a crude description. The constants α, β and ξ control, respectively, the decay and reinforcement timescales and the degree of nonlinearity in the equation, and can be measured experimentally. While ξ is adimensional and depends only on the resistance boundaries, α has the dimension of an inverse time, while β has the dimension of time divided by voltage. Aside from applications to memory devices, there is interest in these components also because memristors can serve as memory for neuromorphic computing devices.17 We first demonstrate that this equation possesses a Lyapunov function that governs its asymptotic behavior. In order to understand the Lyapunov function of the full network, we begin with the case of a single memristor driven by a voltage generator V(t). From the equations above, we have
\[
\frac{d}{dt} w(t) = \alpha w(t) - \frac{1}{\beta}\,\frac{V(t)}{1 + \xi w(t)}, \tag{7.2}
\]

from which we obtain
\[
\big(1 + \xi w(t)\big)\frac{d}{dt} w(t) = \alpha \big(1 + \xi w(t)\big) w(t) - \frac{1}{\beta} V(t)
= \alpha \left( w(t) + \xi w(t)^2 - \frac{V(t)}{\alpha\beta} \right). \tag{7.3}
\]

Let us now define
\[
L(w) = a\, w(t)^2 + b\, w(t)^3 + c\, w(t) V(t). \tag{7.4}
\]

We have
\[
\frac{d}{dt} L(w) = \big(2a\, w(t) + 3b\, w(t)^2 + c\, V(t)\big) \frac{dw}{dt} + c\, w(t) \frac{dV}{dt}. \tag{7.5}
\]

Now assume that V(t) = V₀. If we choose
\[
a = -\frac{1}{2}, \qquad b = -\frac{1}{3}\xi, \qquad c = \frac{1}{\alpha\beta}, \tag{7.6}
\]
then
\[
\frac{d}{dt} L(w) = \left( -w(t) - \xi w(t)^2 + \frac{V_0}{\alpha\beta} \right) \frac{dw}{dt}
= -\frac{1 + \xi w(t)}{\alpha} \left( \frac{dw}{dt} \right)^2. \tag{7.7}
\]
Thus, if α > 0,
\[
\frac{dL}{dt} \le 0, \quad \text{with equality only if} \quad \frac{dw}{dt} = 0, \tag{7.8}
\]
with
\[
L(w) = \frac{V_0}{\alpha\beta}\, w(t) - \frac{1}{2} w(t)^2 - \frac{1}{3} \xi\, w(t)^3. \tag{7.9}
\]



Now, for α = 0 the solution can be obtained explicitly in closed form, and dw/dt = 0 can be satisfied only for w = 1 or w = 0. For α ≠ 0 there is no explicit analytical solution, but the solution can be expressed in the implicit form
\[
s = \frac{V_0}{\beta}, \qquad q(t) = c_0 - t,
\]
\[
f(t) = \frac{\log\!\big(\alpha\xi\, q(t)^2 + \alpha\, q(t) + s\big)}{2\alpha}
+ \frac{\tan^{-1}\!\left( \dfrac{\sqrt{\alpha}\,\big(2\xi q(t) + 1\big)}{\sqrt{4\xi s - \alpha}} \right)}{\sqrt{\alpha}\,\sqrt{4\xi s - \alpha}},
\]
\[
w(t) = f^{-1}(t), \qquad 1 \ge w(t) \ge 0, \tag{7.10}
\]

whose analysis goes beyond the scope of this chapter. However, a way to see that the system must eventually reach one of the boundary points w = {1, 0} is the fact that there is a fixed point of the dynamics, defined by the equation
\[
w^*\big(1 + \xi w^*\big) = \frac{V_0}{\alpha\beta}. \tag{7.11}
\]

However, the analysis of the stability of the fixed point reveals that this is an unstable fixed point. From this fact we can intuitively
understand that if w(0) > w∗ we necessarily have w(∞) = 1, while if w(0) < w∗ we obtain w(∞) = 0. A similar analysis applies to the case of a network of connected memristors, as we will see shortly. Given the fact that w(∞) ∈ {1, 0}, we have wⁿ(∞) = w(∞) for any integer n, and we can simplify the asymptotic form of the Lyapunov function to
\[
L(w_\infty) = \frac{V_0}{\alpha\beta}\, w_\infty - \frac{1}{2} w_\infty^2 - \frac{1}{3}\xi\, w_\infty^3
= \left( \frac{V_0}{\alpha\beta} - \frac{1}{2} - \frac{1}{3}\xi \right) w_\infty.
\]
This function has the asymptotic values
\[
\left\{ \frac{V_0}{\alpha\beta} - \frac{1}{2} - \frac{1}{3}\xi,\; 0 \right\}
= \left\{ w^*\big(1 + \xi w^*\big) - \frac{1}{2} - \frac{1}{3}\xi,\; 0 \right\}. \tag{7.12}
\]
The dynamics of a memristor are thus connected to an optimization problem of the form
\[
L^* = \min\left\{ \frac{V_0}{\alpha\beta} - \frac{1}{2} - \frac{1}{3}\xi,\; 0 \right\}, \tag{7.13}
\]
i.e., a choice between L(1) and L(0) = 0; however, we have no guarantee that the dynamics will "pick" the correct minimum of the Lyapunov function, and from our analysis above we see that this should depend on the initial conditions. It is easy to perform simulations of the system above. For instance, we find that for α = 0.1, β = ξ = 10, and V₀ = 0.92, the system ends in the true minimum of the asymptotic function 70% of the time, yet the system can still have a macroscopic portion of asymptotic states not in the minimum of the Lyapunov "energy". This fact shows that while the Lyapunov function is being minimized along the dynamics of the memristors, the system can effectively be trapped in local minima. This is why we focus on the minimization of a continuous Lyapunov function, for which we can compare the observed asymptotic states from the memristor dynamics to minima obtained via state-of-the-art optimization software.
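A minimal numerical sketch of this experiment (forward-Euler integration of Eq. (7.2) with w clipped to [0, 1]; the model parameters are the ones quoted above, while the time step, horizon, and sample size are arbitrary choices):

```python
import numpy as np

# Forward-Euler integration of Eq. (7.2) for many initial conditions at once
alpha, beta, xi, V0 = 0.1, 10.0, 10.0, 0.92
dt, steps = 1e-2, 50_000

rng = np.random.default_rng(1)
w = rng.uniform(0.0, 1.0, size=1000)        # random initial conditions
for _ in range(steps):
    w = np.clip(w + dt * (alpha * w - V0 / (beta * (1.0 + xi * w))), 0.0, 1.0)

print("fraction ending at w = 1:", np.mean(w > 0.5))   # roughly 0.7 here
```

The trajectories split around the unstable fixed point w* ≈ 0.257 of Eq. (7.11), so the fraction of runs ending at w = 1, roughly 70%, simply reflects the measure of initial conditions above w*.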
7.2.2. Circuits

We now wish to extend the analysis we did for a single memristor to a circuit. We consider a graph in which each edge contains a memristor and a voltage generator in series. The state of the internal memory parameters is thus a vector w in which each entry corresponds to an edge, and each edge is driven by a voltage generator, collected in the vector s(t). Memristors in the graph will now interact due to shared currents at the nodes/electrical junctions of the graph. The extension of Eq. (7.1) to a circuit can be carried out, and is given by
\[
\frac{d}{dt} \vec{w}(t) = \alpha \vec{w}(t) - \frac{1}{\beta} \big(I + \xi \Omega W(t)\big)^{-1} \Omega\, \vec{s}(t), \tag{7.14}
\]

with the constraints 0 ≤ wᵢ ≤ 1, and where we use the convention that W(t) = diag(w(t)) is a diagonal matrix containing the internal memory parameters.18,19 The projection operator Ωᵢⱼ contains the information about the topology of the graph and can be thought of as picking out configurations consistent with Kirchhoff's voltage law. As we will discuss shortly, components of Ωᵢⱼ may also be considered as the interaction strength between memristors in the graph. We note that because Ω is a projection operator, Ω = Ω², we can always write s = Ωs + (I − Ω)s; it is straightforward to show that we can add to s any vector s̃ = (I − Ω)k, which will not affect the dynamics. This form of freedom arises from the Kirchhoff constraints from which the differential equation has been derived. The set of coupled differential equations above incorporates all dynamical and topological constraints of the circuit exactly.18,19 Kirchhoff's laws manifest themselves via the projection operator Ω which intervenes in the dynamics. Such a projection operator also emerges for purely resistive circuits with edges of the graph containing voltage generators Sᵢ in series with resistors rᵢ. For the case of constant resistance rᵢ = r, the equilibrium currents can be written in vectorial form as
\[
\vec{i}(t) = -\frac{1}{r}\, \Omega\, \vec{S}(t), \tag{7.15}
\]
where Ω = Aᵗ(AAᵗ)⁻¹A is a non-orthogonal projector on the cycle space of the graph. The matrix A has dimension Cycles × Edges of the graph (each row designates a fundamental cycle of the graph), and thus Ω has the correct dimension (i.e., the number of memristors).18,20 We can also generalize Equation (7.14) to various forms of driving, including current generators in parallel with memristors or current/voltage generators driving the nodes of the circuit. We cast these in a general form using a generic source vector x as
\[
\frac{d}{dt} \vec{w}(t) = \alpha \vec{w}(t) - \frac{1}{\beta} \big(I + \xi \Omega_A W(t)\big)^{-1} \vec{x}, \tag{7.16}
\]

where we have
\[
\vec{x} =
\begin{cases}
\Omega A \vec{s} & \text{voltage sources in series} \\
A (A^T A)^{-1} \vec{s}_{\mathrm{ext}} & \text{voltage sources at nodes} \\
\Omega B \vec{j} & \text{current sources in parallel} \\
B^T (B B^T)^{-1} \vec{j}_{\mathrm{ext}} & \text{current sources at nodes.}
\end{cases}
\]

The purpose of this chapter is to further our understanding of the asymptotic dynamics of a circuit of memristors, and specifically the statistics of the resistive states. An analysis of the asymptotic states can be carried out via Lyapunov functions, as we did for the case of a single memristor. After a first attempt at deriving a Lyapunov function,21 plagued by constraints on the external fields, here we provide a novel yet similar Lyapunov function free of these requirements. From the point of view of optimization with analog dynamical systems, different Lyapunov functions provide different ways of embedding a computational problem in a physical system. We follow the same prescription as in the single memristor case, but where the interaction matrix is a projection operator on the cycle basis of the circuit.
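Equation (7.14) and the cycle-space projector can be prototyped in a few lines. The following sketch is our own illustration (the graph, edge orientations, sources, and parameters are arbitrary choices, not the instances used later in this chapter): it builds the cycle matrix A from a cycle basis obtained with networkx and integrates the dynamics by forward Euler.

```python
import numpy as np
import networkx as nx

def cycle_projector(G):
    """Build Omega = A^t (A A^t)^{-1} A from a cycle basis of G.

    Each edge gets an arbitrary fixed orientation (the order in which
    networkx lists it); a cycle entry is +1/-1 depending on whether the
    cycle traverses the edge along or against that orientation."""
    edges = list(G.edges())
    idx = {e: k for k, e in enumerate(edges)}
    cycles = nx.cycle_basis(G)
    A = np.zeros((len(cycles), len(edges)))
    for c, nodes in enumerate(cycles):
        for u, v in zip(nodes, nodes[1:] + nodes[:1]):
            if (u, v) in idx:
                A[c, idx[(u, v)]] = 1.0
            else:
                A[c, idx[(v, u)]] = -1.0
    return A.T @ np.linalg.inv(A @ A.T) @ A

# Forward-Euler integration of Eq. (7.14) on a random circuit
G = nx.gnp_random_graph(20, 0.3, seed=2)
Omega = cycle_projector(G)
N = Omega.shape[0]

rng = np.random.default_rng(3)
alpha, beta, xi, dt = 0.1, 10.0, 10.0, 1e-2
w = rng.uniform(0.0, 1.0, N)
s = rng.normal(0.0, 1.0, N)
for _ in range(50_000):
    # (I + xi * Omega * W)^{-1} Omega s, with W = diag(w)
    drive = np.linalg.solve(np.eye(N) + xi * Omega * w, Omega @ s)
    w = np.clip(w + dt * (alpha * w - drive / beta), 0.0, 1.0)
print(np.round(w, 3))   # entries accumulate at the boundary {0, 1}
```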


7.2.3. Lyapunov function for memristor circuits

We begin with the equations of motion,
\[
\big(I + \xi \Omega W\big)\, \dot{\vec{w}} = \alpha \vec{w} + \alpha\xi\, \Omega W \vec{w} - \frac{1}{\beta}\vec{x}, \tag{7.17}
\]

where we have multiplied by (I + ξΩW). Consider
\[
L = -\frac{\alpha}{3}\, \vec{w}^{\,T} W \vec{w} - \frac{\alpha\xi}{4}\, \vec{w}^{\,T} W \Omega W \vec{w} + \frac{1}{2\beta}\, \vec{w}^{\,T} W \vec{x}.
\]
In this case, we have
\[
\frac{dL}{dt} = \dot{\vec{w}}^{\,T} \left( -\alpha W \vec{w} - \alpha\xi\, W \Omega W \vec{w} + \frac{1}{\beta} W \vec{x} \right) \tag{7.18}
\]
\[
= -\dot{\vec{w}}^{\,T} \big(W + \xi W \Omega W\big)\, \dot{\vec{w}}
= -\dot{\vec{w}}^{\,T} \sqrt{W} \big(I + \xi \sqrt{W}\, \Omega \sqrt{W}\big) \sqrt{W}\, \dot{\vec{w}}
= -\big\| \sqrt{W}\, \dot{\vec{w}} \big\|^2_{(I + \xi\sqrt{W}\Omega\sqrt{W})}, \tag{7.19}
\]
and we have that dL/dt ≤ 0 as (I + ξ√W Ω √W) is positive definite. We thus have that L in Equation (7.29) is a Lyapunov function for a circuit of memristors. An asymptotic form can be obtained by replacing wᵢᵏ = wᵢ for integer k, as asymptotically one has wᵢ = {1, 0}. Thus, the asymptotic Lyapunov function is given by
\[
L(\vec{w}) = -\frac{\alpha\xi}{4}\, \vec{w}^{\,T} \Omega \vec{w} + \vec{w}^{\,T} \left( \frac{\vec{x}}{2\beta} - \frac{\alpha}{3}\vec{1} \right),
\]
which is a form familiar from physics in the context of spin systems. We can re-express this in terms of spin variables σᵢ = 2wᵢ − 1 and, with a few simplifications, as
\[
\tilde{L} = \frac{8}{\alpha} L(\vec{\sigma}) = \vec{\sigma} \cdot \left( \frac{2\vec{x}}{\alpha\beta} - \frac{4}{3}\vec{1} - \xi\, \Omega \vec{1} \right) - \frac{\xi}{2}\, \vec{\sigma}^{\,T} \tilde{\Omega}\, \vec{\sigma}, \tag{7.20}
\]

where Ω̃ has only the off-diagonal terms of Ω. The structure of the Lyapunov function above is very similar to the one described before,


but it only requires the spectral condition I + ξ√W Ω √W ≥ 0, which is natural. We can thus identify an effective local field h = (2/(αβ))x − (4/3)1 − ξΩ1 and interactions between memristors given by Ω̃. A notable omission from the Lyapunov function argument above is the presence of boundaries on the internal memory parameters wᵢ. As individual memristors reach their boundaries and their dynamics are halted, the corresponding components of the derivative in Equation (7.19) go to 0. As a test of the fact that the Lyapunov function above works when boundary effects are included, in Figure 7.1 we plot dL/dt evaluated numerically for 100 instances (Ω, h), in which Ω was obtained from random circuits and h is a Gaussian-distributed vector. We now wish to show that the Lyapunov function converges asymptotically only on the boundary of the set [0, 1]ᴺ, which is what one observes numerically.

7.2.4. Number of fixed points and stability

As for the case of the one-dimensional model, the fixed points of the dynamics are important in order to understand the stability of the system. In the previous section we have assumed that our Lyapunov

Figure 7.1. Derivative of the Lyapunov function of Eq. (7.29) for 100 random initial conditions and instances (Ω, h). We see that the derivative is always negative, and thus L is decreasing.


function can be replaced with an asymptotic form which lives on the binary set wᵢ = {0, 1}. We wish to show this feature in this section. The fixed points are determined via
\[
\vec{w}^{\,*} = \big(I + \xi \Omega W^*\big)^{-1} \frac{\Omega\vec{s}}{\alpha\beta}, \tag{7.21}
\]

where w* is a fixed point. Let us assume that w = w* + δw. Then we have
\[
\frac{d}{dt}\, \delta\vec{w} = \partial_{\vec{w}} \vec{f}(\vec{w}^{\,*})\, \delta\vec{w}. \tag{7.22}
\]
For memristors one has22
\[
f_i(\vec{w}) = \alpha w_i - \frac{1}{\beta}\sum_k \big(I + \xi \Omega W\big)^{-1}_{ik} \big(\Omega \vec{s}\big)_k, \tag{7.23}
\]

from which, if we use ∂ₓA⁻¹ = −A⁻¹(∂ₓA)A⁻¹,
\[
\partial_{w_j} f_i = \alpha \delta_{ij} + \frac{\xi}{\beta} \sum_{krts} \big(I + \xi \Omega W\big)^{-1}_{ik}\, \Omega_{kr}\, \big(\partial_{w_j} W\big)_{rt}\, \big(I + \xi \Omega W\big)^{-1}_{ts} \big(\Omega \vec{s}\big)_s. \tag{7.24}
\]

Evaluating this at the fixed point, where (I + ξΩW)⁻¹Ωs = αβ w, we obtain
\[
J_{ij} = \partial_{w_j} f_i = \alpha \left( \delta_{ij} + \xi \sum_k \big(I + \xi \Omega W\big)^{-1}_{ik}\, \Omega_{kj}\, w_j \right) \tag{7.25}
\]
\[
= \alpha \left( \delta_{ij} + \xi \big(I + \xi \Omega W \Omega\big)^{-1}_{ij}\, w_j \right), \tag{7.26}
\]
where the last line can be derived from the Neumann representation of the inverse and the projection condition. We now aim to prove that Jᵢⱼ ≻ 0 which, as α > 0 and ξ > 0, will follow from (I + ξΩWΩ)⁻¹ᵢⱼ wⱼ ≻ 0. Now we have that for any matrix A, A ∼ PAP⁻¹, from which for D ≻ 0 we obtain AD ∼ √D A √D. Thus, (I + ξΩWΩ)⁻¹ᵢⱼ wⱼ ∼ √wᵢ (I + ξΩWΩ)⁻¹ᵢⱼ √wⱼ. This matrix is clearly positive as it is


symmetric, and (I + ξΩWΩ)⁻¹ is positive because ΩWΩ is positive. This implies that Jᵢⱼ ≻ 0, and any fixed point of the equation will be unstable. Of course, there is the possibility that one might start from an initial condition which is itself a fixed point of the dynamics. Let us thus discuss how difficult it is to initialize the system on the fixed-point manifold. Let Σ be the manifold of the fixed points. Then, the probability of landing on it with a random initial condition will be the ratio of the cardinalities of the two sets, C(Σ) and C([0, 1]ᴺ). We thus ask ourselves what C(Σ) is. We can write the fixed-point equation without loss of generality as
\[
\vec{w} + \xi \Omega\, \vec{w}^{\,2} = \frac{\vec{s}}{\alpha\beta} \equiv \vec{b}, \tag{7.27}
\]

where (w²)ᵢ = wᵢ². The equation above can be written as a set of N constraints of the form
\[
w_i + \xi \Omega_{ii}\, w_i^2 - b_i + \xi \sum_{j \neq i} \Omega_{ij}\, w_j^2 = 0, \tag{7.28}
\]

which defines a set of N intersecting quadrics. The intersection of these quadrics defines an algebraic variety of degree 2. According to Bézout's theorem,23 a system of well-behaved polynomial equations (N equations in N variables) of degree d has at most dᴺ solutions, which is exactly 2ᴺ in our case. However, 2ᴺ discrete points are a set of measure zero in [0, 1]ᴺ. Naturally, this implies that if one initializes the memristors at a random initial condition wᵢ(0) ∈ [0, 1], the system is very unlikely to start on the fixed-point manifold, and thus via the unstable dynamics it must reach the boundary of the convex set [0, 1]ᴺ, i.e., {0, 1}ᴺ.

7.3. Analysis and Comparisons

In this section we provide evidence of the capability of memristors to significantly lower the energy as measured by the Lyapunov function. While we have demonstrated a particular form of optimization problem that is 'native' to circuits of memristors, it is common across analog systems that embedding an arbitrary problem into this


form is difficult. For this reason we focus on problem instances that are directly embeddable in memristor circuits, i.e., that arise from different circuit structures.

7.3.1. The instances

To generate instances native to memristor circuits, we formalize the optimization algorithm as a map from a circuit graph G to a projection operator Ω(G). This becomes the coupling matrix of our objective function. The underlying graphs G we chose are an Erdős–Rényi random graph (ER), a 2-dimensional lattice (Lattice2d) and a 3-dimensional lattice (Lattice3d). Given these graphs, we then obtain the projection operator Ωᵢⱼ(G) = Aᵗ(AAᵗ)⁻¹A (which is a dense matrix), which is based on the cycle space of the graph.20 A graphical representation of the underlying circuit is shown in Figure 7.2.

7.3.2. Minimization of the continuous Lyapunov function

We compare the result of the minimization of the function
\[
L = -\frac{\alpha}{3}\, \vec{w}^{\,T} W \vec{w} - \frac{\alpha\xi}{4}\, \vec{w}^{\,T} W \Omega W \vec{w} + \frac{1}{2\beta}\, \vec{w}^{\,T} W \vec{x} \tag{7.29}
\]

Figure 7.2. The three circuit instances we consider. We have an Erdős–Rényi underlying circuit (left), a 2D lattice (center) and a 3D lattice (right). Given these, we then build the cycle matrix A of the circuit and calculate the projection operator Ω = Aᵗ(AAᵗ)⁻¹A, which is a dense matrix and enters the Lyapunov function of Equation (7.29).


using memristive circuits to the results of other optimization algorithms. Specifically, we compare the memristive algorithm, in which the dynamical Equation (7.14) is evolved numerically until it reaches a steady state, to an interior-point nonlinear optimization algorithm. As a solver, we use Ipopt,24 an open-source (second-order) software package for large-scale nonlinear optimization. Specifically, the software is state of the art for nonlinear problems of the form
\[
\min_{w \in \mathbb{R}^d} f(w), \tag{7.30}
\]
\[
\text{s.t.} \quad g^L \le g(w) \le g^U, \tag{7.31}
\]
\[
w^L \le w \le w^U, \tag{7.32}
\]

where f(w) is the function of interest (in our case Equation (7.29)), wᴸ and wᵁ are 0 and 1, respectively, in this work, and where we introduce no g(w) function constraints in the optimization. The results of the two algorithms for 15 specific instances are shown in Table 7.1, for the case of the ER circuits and lattices in 2 and 3 dimensions. The number of variables we consider is fairly large, i.e., in the range N ∈ [112, 300]. Recognizing that both of these algorithms are sensitive to their initial starting conditions, for comparison we consider 128 i.i.d. executions of each algorithm starting from random initial conditions in the interval [0, 1]ᴺ and measure the distributions of runtime and solution quality. In the interest of breadth, a first-order optimization algorithm based on gradient descent and a random assignment algorithm (i.e., random values drawn uniformly from [0, 1]ᴺ) are also included in the comparison. The results are shown in Figures 7.3–7.5, which compare optimization via memristor networks (mem, light blue), random assignment (rand, brown), Ipopt (nlp, purple) and gradient descent (grad, red). First and foremost, we note that overall Ipopt yields the best solution quality among the optimization algorithms we considered for each specific instance. In Figures 7.3–7.5 we plot examples of the distribution of energy states for the ER, Lattice2d and Lattice3d cases. We see that gradient descent and Ipopt are typically close to each other for these cases, and in particular in the ER case the

page 206

Table 7.1. Result of the optimization of each instance (1–5) of the three classes considered in this chapter: the Erdős–Rényi (ER) random circuits, and the Lattice2d and Lattice3d classes.

instance   n    e      nlp-en   grad-en  mem-en   rand-en  nlp-tm  grad-tm  mem-tm  rand-tm
ER 1       249  62001  −601.07  −592.09  −577.36  −153.90  118.98  0.29     1.37    10⁻³
ER 2       242  58564  −576.72  −572.52  −561.44  −151.70  103.21  0.21     2.73    10⁻³
ER 3       272  73441  −671.67  −662.14  −654.09  −173.22  159.37  0.29     3.10    10⁻³
ER 4       220  48400  −520.03  −511.88  −498.89  −127.31  76.98   0.20     1.71    10⁻³
ER 5       262  68644  −650.62  −640.00  −630.79  −168.76  139.56  0.29     1.10    10⁻³
Latt2d 1   112  12544  −73.80   −70.98   −43.82   −8.86    3.96    0.04     0.48    10⁻³
Latt2d 2   112  12544  −78.72   −71.99   −41.50   −9.42    3.99    0.04     0.44    10⁻³
Latt2d 3   112  12544  −71.86   −68.00   −44.69   −3.03    4.10    0.06     0.68    10⁻³
Latt2d 4   112  12544  −73.6    −72.33   −42.98   −9.30    4.12    0.04     0.87    10⁻³
Latt2d 5   112  12544  −71.99   −70.17   −45.62   −8.63    3.99    0.06     0.75    10⁻³
Latt3d 1   300  90000  −158.71  −159.01  −110.36  0.00     164.26  0.40     1.83    10⁻³
Latt3d 2   300  90000  −169.60  −164.01  −132.64  −9.11    150.42  0.38     2.52    10⁻³
Latt3d 3   300  90000  −161.88  −162.91  −113.74  −1.59    140.47  0.41     4.21    2·10⁻³
Latt3d 4   300  90000  −164.15  −167.83  −116.01  −6.74    145.19  0.31     3.63    10⁻³
Latt3d 5   300  90000  −169.36  −166.25  −115.71  −9.98    151.80  0.30     3.33    10⁻³

Notes: The number of nodes of the circuit is given in the nodes column (n), while the number of edges of the graph (the variables) is in the edge column (e). The results of the optimization using Ipopt, gradient descent, memristors and random assignment are in the (nlp, grad, mem, rand)-en columns, respectively, while the average time (in seconds) for the solution to be obtained is in the (nlp, grad, mem, rand)-tm columns.


Figure 7.3. Distribution of the minima obtained with random sampling (rand), memristors (mem), gradient descent (grad) and Ipopt (nlp) for the Erdős–Rényi class (Instance 1). We see that the distribution of minima for memristors is rather close to the NLP and Grad results.

Figure 7.4. Distribution of the minima obtained with random sampling (rand), memristors (mem), gradient descent (grad) and Ipopt (nlp) for the Lattice2d class (Instance 1). We see that the system's absolute minimum is close to the tail of the nonlinear programming optimization code (Ipopt), while on average the minima lie halfway between the random and nlp results.


Figure 7.5. Distribution of the minima obtained with random sampling (rand), memristors (mem), gradient descent (grad) and Ipopt (nlp) for the Lattice3d class (Instance 1). We see that the system's absolute minimum is close to the tail of the nonlinear programming optimization code (Ipopt).

memristive optimization is also close to the best known solutions. For comparison, we plot in all these cases the results of a naive random optimization, from which it can be observed that the memristive circuit results are always well below the random assignment. For each class of problems we generated 5 instances. The minimum energy and average time per execution for each instance and class are shown in Table 7.1. We see that memristors have a runtime advantage over Ipopt (a factor of roughly 100), as they run much faster, so one can re-initialize the system many more times in an equal amount of time. Also, we observe that the density of the Ω matrix places a significant computational burden on the computation of derivatives in second-order methods such as Ipopt, a problem feature that the memristor-based approach avoids. The optimal solutions found by Ipopt for the Lyapunov function confirm that (within a tolerance of 10⁻³) the solutions are to be found near the boundary of [0, 1]ᴺ. These results somewhat confirm that the asymptotic states of a memristive circuit are to be found


in local minima of a Lyapunov function, and in the discrete set {0, 1}ᴺ of the system.
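As a rough, self-contained stand-in for the comparison above (Ipopt is accessed through its own toolchain), one can minimize the Lyapunov function (7.29) with a generic box-constrained quasi-Newton solver. In the sketch below the projector Ω is a random orthogonal projector used as a placeholder for a circuit projector, and all parameter values are illustrative, not the ones behind Table 7.1:

```python
import numpy as np
from scipy.optimize import minimize

def lyapunov(w, Omega, x, alpha=0.1, beta=10.0, xi=10.0):
    """L(w) of Eq. (7.29), with W = diag(w)."""
    w2 = w * w
    return (-(alpha / 3.0) * np.sum(w ** 3)          # -(alpha/3) w^T W w
            - (alpha * xi / 4.0) * w2 @ Omega @ w2   # -(alpha*xi/4) w^T W Omega W w
            + w2 @ x / (2.0 * beta))                 # +(1/(2*beta)) w^T W x

rng = np.random.default_rng(4)
N = 50
Q, _ = np.linalg.qr(rng.normal(size=(N, N)))
Omega = Q[:, :N // 2] @ Q[:, :N // 2].T   # random rank-N/2 projector (placeholder)
x = rng.normal(size=N)

# Multi-start box-constrained minimization, keeping the best of 16 runs
best = min((minimize(lyapunov, rng.uniform(0.0, 1.0, N), args=(Omega, x),
                     method="L-BFGS-B", bounds=[(0.0, 1.0)] * N)
            for _ in range(16)), key=lambda r: r.fun)
print(best.fun, np.round(best.x, 2))   # minima sit near the boundary of [0, 1]^N
```

Restarting from many random initial conditions and keeping the best result mirrors the multi-start protocol (128 i.i.d. executions) used for all solvers above.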

7.4. Conclusions

We have discussed the properties of memristive circuits from an optimization perspective. In particular, we have derived a new Lyapunov function for a memristive circuit of arbitrary topology, and shown that if each memory parameter is constrained to [0, 1], then the asymptotic memristor values lie on the boundary of this set. This is because the fixed points of the dynamics are unstable and of countable cardinality, as we have shown. The Lyapunov function has been derived under the assumption that the dynamics lie in the bulk, thus ignoring the boundaries. We first discussed these features analytically in the case of a single memristor device. These results have a variety of implications. In primis, this shows that it is possible to overcome some of the problems of previously proposed Lyapunov functions in the literature. Moreover, we have tested (from the standpoint of optimization) whether analog circuits of memristors can be used for minimizing nonlinear functions. We have tested three indicative classes of circuits, and found that while in none of these cases do the memristor dynamics obtain better minima than state-of-the-art software (Ipopt), they are nonetheless able to obtain good-quality minima when the system is initialized multiple times. From this point of view, memristive dynamics has the advantage of providing fast, good-quality solutions. For instance, in the case of non-planar circuits Ipopt took two orders of magnitude longer than the memristive circuits to provide a plausible minimum. In this sense, we have confirmed that memristive dynamics is naturally associated with the minimization of a Lyapunov function. When run in hardware, we expect this speed advantage to increase substantially. Some comments about the difficulty of the instances we considered are in order. The class we consider, drawn from circuit structures, is previously unexplored, and thus the difficulty of optimization problems in this class is unknown. We can, however, draw a few inferences about this class from our results. We note first that the


solvers we test produce a range of potential solutions, giving evidence that these instances are not simply convex and contain a range of local minima. The software Ipopt takes considerable time to find minima in the case of ER and Lattice3d, but not for Lattice2d, in which the underlying circuit is planar. We emphasize that, while the circuit is planar, the matrix Ω is dense (none of the elements are zero). This said, it has been proven that for the case of planar circuits the matrix Ω has exponentially small support on the underlying graph.19 From this point of view, our results suggest that the class Lattice2d is not as hard as the other two we consider, due to such planarity hidden in the matrix Ω; this can also be seen from the fact that Ipopt takes significantly less time in finding good-quality minima for this class. While we proposed a Lyapunov function for a continuous set of variables, it is still an open question whether there exists an efficient embedding of a QUBO functional in a memristive circuit such that the QUBO functional is minimized along the dynamics as well. The key issue is that while memristors reach the boundaries of the space M = [0, 1]ᴺ, it is unknown whether an efficient embedding exists. This is left for future investigations.

Acknowledgments

This work was carried out under the auspices of the NNSA of the U.S. DoE at LANL under Contract No. DE-AC52-06NA25396, in particular via DOE-ER grant PRD20190195. Also, FCS is supported by a CNLS Fellowship.

References

1. F. Caravelli and J. P. Carbajal, Memristors for the curious outsiders. Technologies 6(4), 118 (2018).
2. L. Chua and Sung Mo Kang, Memristive devices and systems. Proc. IEEE 64(2), 209–223 (1976). doi:10.1109/PROC.1976.10092.
3. J. J. Yang, D. B. Strukov, and D. R. Stewart, Memristive devices for computing. Nat. Nanotechnol. 8(1), 13–24 (2013). doi:10.1038/nnano.2012.240.
4. L. Chua, If it's pinched it's a memristor. Semicond. Sci. Technol. 29(10), 104001 (2014). doi:10.1088/0268-1242/29/10/104001.
5. D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, The missing memristor found. Nature 453(7191), 80–83 (2008). doi:10.1038/nature06932.
6. K. G. Johnsen, An introduction to the memristor — a valuable circuit element in bioelectricity and bioimpedance. J. Electr. Bioimp. 3, 20–28 (2012). doi:10.5617/jeb.305.
7. F. L. Traversa and M. Di Ventra, Memcomputing: Leveraging memory and physics to compute efficiently. J. Appl. Phys. 123, 180901 (2018). doi:10.1063/1.5026506.
8. F. L. Traversa and M. Di Ventra, Polynomial-time solution of prime factorization and NP-complete problems with digital memcomputing machines. Chaos 27, 023107 (2017). doi:10.1063/1.4975761.
9. F. C. Sheldon and M. Di Ventra, Conducting-insulating transition in adiabatic memristive networks. Phys. Rev. E 95(1), 012305 (2017). doi:10.1103/PhysRevE.95.012305.
10. A. Adamatzky and B. D. L. Costello, Physarum attraction: Why slime mold behaves as cats do? Commun. Integr. Biol. 5(3), 297–299 (2012). doi:10.4161/cib.19924.
11. C. Coffrin, H. Nagarajan, and R. Bent, Evaluating Ising processing units with integer programming. In L.-M. Rousseau and K. Stergiou (eds.), Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Lecture Notes in Computer Science, vol. 11494 (2019). doi:10.1007/978-3-030-19212-9_11.
12. E. G. Rieffel and W. H. Polak, Quantum Computing: A Gentle Introduction (MIT Press, Cambridge, MA, 2011).
13. M. Di Ventra and Y. V. Pershin, The parallel approach. Nat. Phys. 9(4), 200–202 (2013). doi:10.1038/nphys2566.
14. F. L. Traversa et al., Evidence of an exponential speed-up in the solution of hard optimization problems. Complexity 2018, 7982851 (2018). doi:10.1155/2018/7982851.
15. F. L. Traversa, C. Ramella, F. Bonani, and M. Di Ventra, Memcomputing NP-complete problems in polynomial time using polynomial resources and collective states. Sci. Adv. 1(6), e1500031 (2015). doi:10.1126/sciadv.1500031.
16. L. Chua, Memristor — The missing circuit element. IEEE Trans. Circuit Theory 18(5), 507–519 (1971). doi:10.1109/TCT.1971.1083337.
17. I. Gupta, A. Serb, R. Berdan, A. Khiat, and T. Prodromakis, Volatility characterization for RRAM devices. IEEE Electron Device Lett. 38(1) (2017). doi:10.1109/LED.2016.2631631.
18. F. Caravelli, F. L. Traversa, and M. Di Ventra, Complex dynamics of memristive circuits: Analytical results and universal slow relaxation. Phys. Rev. E 95(2), 022140 (2017). doi:10.1103/PhysRevE.95.022140.
19. F. Caravelli, Locality of interactions for planar memristive circuits. Phys. Rev. E 96, 052206 (2017). doi:10.1103/PhysRevE.96.052206.
20. A. Zegarac and F. Caravelli, Eur. Phys. Lett. (Perspectives) 125, 10001 (2019).
21. F. Caravelli, Asymptotic behavior of memristive circuits. Entropy 21(8), 789 (2019).
22. F. Caravelli, The mise en scène of memristive networks: Effective memory, dynamics and learning. Int. J. Parallel, Emergent Distrib. Syst. 33(4), 350–366 (2018). doi:10.1080/17445760.2017.1320796.
23. W. Fulton, Algebraic Curves, Mathematics Lecture Note Series (W. A. Benjamin, 1974).
24. A. Wächter and L. T. Biegler, On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Math. Program. 106(1), 25–57 (2006).
© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0008

Chapter 8

Organic Memristive Devices for Bio-inspired Applications

Victor Erokhin

Institute of Materials for Electronics and Magnetism, Italian National Research Council (IMEM-CNR), Parco Area delle Scienze 37/A, Parma, 43124, Italy
[email protected]

Organic memristive devices are considered promising elements for bio-inspired information processing. In this chapter we will consider several applications of these devices for the realization of neuromorphic circuits and networks, such as logic with memory, artificial neuronal networks, bio-mimicking electronic circuits, synapse prostheses, and stochastic 3D adaptive networks.

8.1. Introduction

It is impossible to imagine modern life without computers: they are used for practically all aspects of human activity. At the early stages of computer design it was supposed that it would help to better understand the function of the brain. However, the architecture and working principles of modern computers are very different from those of the nervous system and brain of humans and animals. Among the obvious differences we can underline the following ones. First, in the nervous system and brain the same elements are used for memorizing and processing information, while in computers memory and processor are separate devices, without direct effect of one on the properties of the other. Therefore, the brain is more oriented toward "learning", because information is not only recorded, but also varies the connections within the "processor".


Second, the brain provides parallel processing of information, while in single-core computers only one operation is performed at a time. The brain therefore simultaneously processes large arrays of data coming from external stimuli, arriving from sensory systems, superimposed onto the "experience" accumulated during the brain's previous activity. Finally, the way signals propagate in the brain allows unsupervised learning. The main paradigm of such learning is spike timing-dependent plasticity (STDP), describing the reinforcement or inhibition of synaptic connections between nerve cells according to the value of the time delay between pre- and postsynaptic spikes. This rule can be considered a cause-and-result relationship, because the shorter the delay between the presynaptic and postsynaptic spikes, the more probable it is that the presynaptic spike (event 1) is a cause of the postsynaptic spike (event 2). If event 2 occurs before event 1, the weight of the connection is decreased. Mimicking the mentioned properties requires new architectures, based on special elements having properties similar to those of synapses, memorizing a successful event by varying the weight function of the connections between threshold elements (neuronal cells). In recent years memristive devices have been considered the best candidates for performing such a function. The term "memristor"1 was introduced by L. Chua and attracted explosive attention after the work published in 2008.2 Recently the term "memristor" has been substituted by "memristive device" or "resistance switching element" (a term introduced before 2008), because memristors in their original definition cannot exist.3–6 On the other hand, several research groups have begun to identify other types of elements with memory as "memristive devices".7 Different types of memristive devices, based on a variety of materials and different working principles, have been reported. Originally, these elements were considered a promising basis for memory systems. Now, the areas of their possible application have been significantly enlarged, including neuromorphic systems mimicking some features of nervous systems and the brain. The majority of papers dedicated to memristive devices are based on inorganic (mainly metal oxide) active layers8–17 and the


mechanism of the resistance switching involves the formation of conducting filaments in the insulating layer, with the variation of its conductivity due to redox reactions in it.18–23 The organic memristive device was developed to mimic synapse properties for subsequent use in adaptive systems.24 Since the device has been known since 2005,25 in this chapter we will only briefly recall the important basic features of the element and will discuss in more detail specific applications of these devices for the realization of alternative computing systems. In particular, we will consider hardware implementations of artificial neuron networks and neuromorphic systems, where organic memristive devices are used to mimic synaptic learning (including STDP-like unsupervised learning) and frequency-driven plasticity, and their use in circuits containing living beings, including the coupling of living neuronal cells through artificial synapses based on memristive devices.
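The STDP rule recalled above can be written compactly. A generic, textbook-style illustration with an exponential window (the amplitudes and time constant are hypothetical, not fitted to any device discussed in this chapter):

```python
import math

def stdp_dw(dt_ms, a_plus=0.10, a_minus=0.12, tau_ms=20.0):
    """Weight change for a spike pair with dt = t_post - t_pre:
    potentiation when the presynaptic spike precedes the postsynaptic
    one (dt > 0), depression when the order is reversed."""
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)
    return -a_minus * math.exp(dt_ms / tau_ms)

# Shorter causal delays give stronger potentiation; reversed order depresses:
print(stdp_dw(5.0), stdp_dw(40.0), stdp_dw(-5.0))
```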

8.2. Organic Memristive Device

The initial idea of the organic memristive device was to make an element having two important properties of synapses: a memory effect (hysteresis of the electrical properties) and unidirectional conductance (rectification). The Mnemotrix of Valentino Braitenberg,26 an essential element of the mental experiment for the realization of adaptive vehicle systems, was taken as a benchmark, with one exception: the highly conducting state can be inhibited by external action, in order to prevent the entire system from saturating and to allow both supervised learning, inhibiting occasionally established connections, and unsupervised learning, when the postsynaptic event takes place before the presynaptic one. The diode reported in Ref. [27] was considered as an element providing the unidirectional (rectifying) behavior of the system. It was necessary to modify the structure to obtain memory behavior as well. The structure of the realized device is shown in Figure 8.1. A polyaniline thin layer with two connected metal electrodes is the essential part of the device.25 The working principle of the element is based on the large difference of the polyaniline conductivity in


Figure 8.1. Schematic view of the organic memristive device. Reproduced from Ref. [34] with permission from AIP Publishing.

its reduced and conducting states.28 Switching between these states can be done electrochemically, applying appropriate reducing or oxidizing potentials. However, in order to perform these transformations, we need an appropriate medium (electrolyte) and a reference point for the potential. Therefore, the polyaniline channel was covered in its central part with an electrolyte stripe, to which the reference electrode was attached. The electrolyte was made from polyethylene oxide doped with lithium salts.24,25 Different salts were studied for the optimization of the memristive device properties.29 Usually, the electrolyte was in solid form; however, in some cases we have used it in a gel30 or even liquid state.31 A silver wire was used as the reference electrode in most cases, because it demonstrated the best performance for the device properties among all investigated materials. The thickness of the polyaniline channel is rather critical for allowing redox reactions through the whole depth. Therefore, it was mainly fabricated by the Langmuir–Blodgett technique,25 which allows nm-scale resolution; the so-called layer-by-layer method was used in some cases.32,33 The reference electrode is connected directly to one of the metal electrodes, which is maintained at ground potential. Therefore, with respect to the external circuit the element can be considered a two-terminal one. However, two types of currents flow in the device: an electronic (more precisely, hole) current in the polyaniline channel (the main contribution to the conductivity in the ON state of the device)

Figure 8.2. Typical cyclic voltage–current characteristics of the organic memristive device for electronic (a) and ionic (b) currents. Reproduced with permission from Ref. [31]. © 2020 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

and an ionic current, occurring between the electrolyte and the channel and responsible for the resistance switching of the entire device. It has been shown that in these devices the resistance value is a function of the ionic charge that has passed through the device (in the case of an "ideal memristor" it must be a function of the total charge), which has also been demonstrated directly by X-ray fluorescence measurements acquired along with the electrical characterization.30 Typical cyclic voltage–current characteristics of these devices are shown in Figure 8.2. Increasing the voltage from 0 V, we observe low current values for both the electronic and ionic contributions until the voltage reaches a certain value (about 0.5 V), when we begin to observe a significant increase of the electronic current, accompanied by a positive maximum of the ionic current. The material in the active zone (the contact of the polyaniline channel with the solid electrolyte) transforms into its oxidized conducting form (the ON state of the device). The device maintains its ON state during the further increase of the applied voltage up to the maximum value (1.0–1.2 V) and during the initial part of its decrease, down to about 0.1 V, when we see a decrease of the electronic conductivity, accompanied by the appearance of a negative maximum in the ionic current characteristics. At this point the polyaniline in the active zone is transformed into its reduced form and the device goes to its


OFF state, which is maintained for the whole negative branch of the cyclic voltage–current characteristics. The application of a constant positive (higher than threshold) or negative (any) voltage results in a gradual increase (decrease) of the device conductivity. It is worth noting that the time necessary for the transition from the insulating form to the conducting one is significantly longer than for the back transition from the conducting form to the insulating one. Such behavior is well explained by the developed models.35–37 It should also be noted that initially all experiments were carried out in DC mode; however, subsequent experiments have shown that the device works in a similar way in pulse mode.38 It seems useful to compare the main properties of organic memristive devices with those of inorganic ones. Regarding the obvious advantages of organic devices with respect to inorganic ones, we can mention light weight, low cost, flexibility and low energy consumption.39 Being based on adequate materials and/or with suitably modified surfaces, these devices are biocompatible, allowing even implant applications.40–44 One of the main characteristics of such devices is the ON/OFF conductance ratio. In the case of organic memristive devices this value, after optimization of the composition and architecture, can reach 10⁵, which is among the best values reported in the literature for this parameter.34,45 Thus, regarding this value the device is among the best available up to now. An extremely important advantage of these devices compared to memristive systems based on filament growth is the fact that the resistance switching potentials are fixed, while in the case of the mentioned inorganic devices these values can differ significantly not only from one device to another, but also between different cycles of the same device. Another important advantage of our systems is that they do not require forming processes, which are essential features of most inorganic memristive devices and require the application of high voltages that can be very critical, especially when interfacing with living beings and cells. Considering drawbacks, it should be mentioned that the switching time between different resistance states is longer than in the case of inorganic devices.
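The switching behavior described above (oxidation above about +0.5 V on the upward sweep, reduction below about +0.1 V on the downward sweep, with the OFF-to-ON transition slower than the reverse) can be caricatured in a few lines. This is a toy model with invented rates, not a fit to the measured characteristics:

```python
import numpy as np

# Toy hysteresis model: the conductance g rises toward g_on above the
# oxidation threshold and falls toward g_off below the reduction threshold;
# in between it is retained (memory). All numerical values are illustrative.
v_ox, v_red = 0.5, 0.1            # switching thresholds quoted in the text
g_on, g_off = 1.0, 1e-5           # limiting conductances (ON/OFF ratio ~ 1e5)
rate_ox, rate_red = 0.01, 0.1     # reduction modeled as faster than oxidation

n = 1000
sweep = np.concatenate([np.linspace(0.0, 1.2, n),
                        np.linspace(1.2, -1.2, 2 * n),
                        np.linspace(-1.2, 0.0, n)])
g, current = g_off, []
for v in sweep:
    if v > v_ox:
        g += rate_ox * (g_on - g)     # gradual oxidation: device turns ON
    elif v < v_red:
        g += rate_red * (g_off - g)   # reduction: device returns to OFF
    current.append(g * v)             # pinched hysteretic I-V loop
```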


However, as has been demonstrated, the switching speed scales with the decrease of the device's lateral size36 and the thickness of the active channel.46 Considering that the minimum lateral size reached for organic memristive devices is 20 microns, we can expect a significant improvement of the switching speed when the sizes reach the nm range. Moreover, considering the high ON/OFF ratio, even now the devices allow synapse characteristics to be mimicked in a temporal range comparable with that of the processes occurring in nervous systems and the brain.47 Another significant drawback is connected to the lower stability and endurance of organic devices with respect to inorganic ones. In fact, the best reported temporal stability of organic memristive devices is about half a year, and the endurance is about 10⁵ cycles.46 If we consider applications such as memory arrays, this drawback is very critical: the performance will be significantly lower than that of commercially available memory units. However, for neuromorphic applications this feature is not very critical. Moreover, it can be an advantage when the system must be balanced between short- and long-term memory and adaptation. In fact, when working in pulse mode the frequency plays a key role in the potentiation and inhibition of synaptic connections; therefore, the connection must have a certain degree of volatility for performing brain-like information processing. The last drawback is connected to the fact that organic memristive devices lie outside the existing electronic technology. It should be noted that not all inorganic devices of this type are ready to be included in the standard fabrication processes of traditional electronic circuits; therefore, increasing attention is currently dedicated to devices where the active layer is made from HfO2, because this material is widely used for the fabrication of traditional CMOS circuits.48–51 Nevertheless, organic materials have found several important applications in recent years, and their number is gradually increasing. Thus, we can expect that in a short time alternative technological approaches for the realization of commercially available electronic devices will completely eliminate this drawback of the use of organic materials for memristive device fabrication.


8.3. Adaptive Circuits

Alternative computing requires different approaches to data treatment. In particular, Boolean logic functions must be modified to allow the memorizing of events. For example, drawing a parallel with humans and animals, the AND function can be attributed to the identification of an object when two or more essential properties are present. This association, however, cannot be made automatically; it comes from accumulated experience. Therefore, the output of an AND gate can take continuously increasing values between zero and one, according to the duration and/or frequency of repetition of the input configurations, confirmed by positive feedback. If the feedback value is negative, the output signal is decreased. Notably, a single memristive device is enough to carry out such a function. Three main logic gates with memory (AND, OR, NOT) have been reported using organic memristive devices.52, 53 As an example, the scheme of the AND element is shown in Figure 8.3, and the temporal variation of the output signal for different configurations of the input signals is shown in Figure 8.4.52 Similar dependences have also been studied for OR and NOT elements with memory.52

Figure 8.3. Scheme of the logic gate AND with memory. Reproduced with permission from Ref. [52] © 2020 World Scientific Publishing Company.



Figure 8.4. Temporal dependence of the output current of the AND element with memory (a) as a result of application of signals to the first (b) and second (c) inputs. Reproduced with permission from Ref. [52] © 2020 World Scientific Publishing Company.

In the case of the AND gate with memory, we compared the characteristics of an element made from an organic memristive device with those of one based on a traditional inorganic metal-oxide memristive device. The obtained results demonstrated that systems based on organic memristive devices are better suited to the realization of analog systems, with a rather slow, gradual variation of the output signal with the duration and/or frequency of application of adequate input signals, while inorganic memristive devices show very rapid resistance switching, practically without intermediate states, which seems more suitable for the realization of digital logic elements with memory. The variations of the output signal in time for these memristive devices are shown in Figure 8.5.
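To make this behavior concrete, the sketch below reuses the toy conductance model from the previous sketch to emulate an AND gate with memory: the output grows gradually only while both inputs coincide and the feedback is positive, and decays under negative feedback. All names and constants are illustrative assumptions, not the published gate design.

```python
# Toy AND-with-memory element (sketch; constants are illustrative).
# The gate output is the normalized conductance of a single memristive
# link that is potentiated only when both inputs are simultaneously high.

def and_with_memory(g, in1, in2, feedback, dt=1.0,
                    tau_up=400.0, tau_down=50.0):
    """One update step of the adaptive AND gate."""
    if in1 and in2 and feedback > 0:      # coincident inputs, reinforced
        g += (1.0 - g) * dt / tau_up
    elif feedback < 0:                    # negative feedback: depression
        g -= g * dt / tau_down
    return min(max(g, 0.0), 1.0)

g = 0.0
for t in range(300):                      # 300 s of confirmed coincidence
    g = and_with_memory(g, 1, 1, feedback=+1)
print(f"output after repeated (1,1) presentations: {g:.2f}")  # grows
for t in range(30):                       # brief negative feedback
    g = and_with_memory(g, 1, 0, feedback=-1)
print(f"output after negative feedback: {g:.2f}")             # decays
```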



Figure 8.5. Temporal dependence of the output current of the AND with memory element, based on an inorganic memristive device, (a) as a result of the application of voltages on the first (b) and second (c) input electrodes. Reproduced from Ref. [53]. © IOP Publishing. Reproduced with permission. All rights reserved.

Artificial neuronal networks form a very important class of adaptive networks. As stated in the introduction, these systems were initially considered for hardware realization, while currently they are mainly realized at the software level. The increased activity in the area of hardware implementation during recent years is connected to two facts: first, hardware must provide parallel computing; second, resistive-switching memristive devices provide a significant simplification of the schematic realization of connections capable of varying their weight functions (ideally, one element for each connection).


The perceptron is a fundamental element of artificial neuronal networks. It was developed by Rosenblatt in 1957 as a model of brain perception.53 The scheme of the perceptron is shown in Figure 8.6. The perceptron contains threshold elements that allow further signal propagation only when the sum (integral) of the input signals is higher than a certain value. These threshold nodes are arranged in layers, and each element of one layer is connected with all elements of the successive layer through connections capable of varying their weight function (conductivity, in our case). By definition, the perceptron initially has a stochastic distribution of the weight functions (resistances) of the connections, so that the value of the output signal has no correlation with the vector of input signals. After an adequate supervised training procedure, however, the weight functions are adjusted in such a way that the perceptron is able to classify objects (corresponding to different input configurations) according to the performed training algorithms. Usually, a back error propagation algorithm is used for training perceptrons.55 The simplest case is the single-layer elementary perceptron, introduced by Wasserman,56 where each sensor neuron is directly connected to each neuron of the associative layer.
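For orientation, a minimal forward pass of such a single-layer element is sketched below. The weights stand in for memristor conductances (w[0] being the permanently biased offset input discussed in the following paragraphs), and the hard threshold plays the role of the neuron; the numeric values are illustrative assumptions.

```python
# Minimal single-layer perceptron forward pass (sketch).
# Weights w[1], w[2] model memristor conductances; w[0] is the
# permanently biased input that provides the offset of the
# separating plane.

def perceptron_output(x1, x2, w):
    """Return 1 if the weighted sum crosses the threshold, else 0."""
    s = w[0] + w[1] * x1 + w[2] * x2   # y(x) = w0 + w1*x1 + w2*x2
    return 1 if s > 0 else 0

w = [1.5, -1.0, -1.0]                  # weights realizing NAND
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron_output(x1, x2, w))
# prints 1 for (0,0), (0,1), (1,0) and 0 for (1,1)
```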

Figure 8.6. Scheme of the perceptron.


This type of perceptron does not require intermediate neurons, and there is only one layer of connections with variable weight functions. Note that the perceptron of Wasserman is not functionally equivalent to that of Rosenblatt: the perceptron of Rosenblatt should be able to classify objects that are not linearly separable, while that of Wasserman should not. The starting point for the elementary single-layer perceptron realized on organic memristive devices was to perform classification according to the NAND logic function, for which the classified objects can be linearly separated. The realized scheme is shown in Figure 8.7.57 The input layer consists of two functional inputs, corresponding to the characteristic features of the objects that must be classified. The layer also contains a permanently biased input, providing an offset for the function that must separate objects into two classes. Such a system can distinguish only two classes of objects: if the output signal is one, the object belongs to class A; if the value is zero, the object belongs to class B. The number of recognizable classes can be increased simply by adding output neurons with the necessary memristive connections to the input neurons. An important critical issue is the perceptron training algorithm. The simplest algorithm, suggested by Rosenblatt, was chosen.

Figure 8.7. Scheme of the elementary perceptron based on organic memristive devices. Reprinted from Ref. [57] with permission. © 2015 Elsevier.


It is a method of error correction,55 wherein the weights of all active connections are changed in the same way at each iteration step, depending only on the sign of the error and not on its magnitude.57 To demonstrate the working principles of our perceptron, the simplest tasks were chosen: classification of inputs according to the logic functions NAND and NOR. Each of the two feature inputs is binary, assuming the value of logical "0" or "1". The function response is generated on the output neuron, according to which the incoming images are divided into two classes, also encoded by logical "0" or "1". For example, in the case of classification according to the NAND function, the images given by the feature vectors (0, 0), (0, 1) and (1, 0) belong to class "1", and the image (1, 1) to class "0". The different classes are separated by a plane, as shown in Figure 8.8. Consequently, the results of NAND and NOR are linearly separable. In our case, applied voltages were used as input signals, while the output signal was the current value.

Figure 8.8. Geometrical representation of the perceptron output (the separating plane) when it has been trained according to the NAND function. Reprinted from Ref. [57] with permission. © 2015 Elsevier.


It was important to choose the input values corresponding to logical one and zero in such a way that they do not vary the conductivity states of the memristive devices. According to the performed device characterization,57 these values were chosen as +0.4 V for logic one and +0.2 V for logic zero. Indeed, as previously demonstrated, the conductivity state of our organic memristive devices is maintained within this range of applied voltages.24, 25 Truth values for the NAND function classification are shown in Table 8.1.

Table 8.1. Truth values for the NAND function.

In1  In2  Out
0    0    1
1    0    1
0    1    1
1    1    0

The iterative error-correction training procedure was the following. At each step of the procedure we applied the input signals xi in series (four possible input combinations of applied voltages) and measured the sign of the error value (err = y − yd, where yd is the desired output and y the actual one). If err is negative, a potentiation stimulus must be applied to the active inputs. If err is positive, a depression stimulus must be applied to the active inputs. If err = 0, no action is required. Therefore, the body of the training cycle (one epoch) consists of the application to the network of all four different combinations of input vectors and of weight adaptation at each step, if necessary. When a sequence of four zero errors occurred, we assumed that the perceptron had been trained to perform the desired classification according to the chosen logic operation. Two other values of the applied voltage were chosen for reinforcing the weight functions (+0.7 V) and for their inhibition (−0.2 V). The time intervals required for performing the reinforcement and inhibition functions were estimated by analyzing the temporal variations of the output current at the fixed chosen voltage values (+0.7 V and −0.2 V).57 It was found that there are only very small


differences in these time constants from one sample to another; on average, the time constants were found to be 400 s for reinforcement and 50 s for inhibition (as mentioned in the previous section, the transfer from the insulating state to the conducting one is much slower than the back transition, due to the fact that both oxidation and reduction potentials occur at positive potentials). The perceptron consisted of three input voltage sources, three PANI-based memristive devices and a so-called neuron (in our case, a controlled ammeter). The perceptron output is the current through the ammeter. If it was less than the threshold current minus the demarcation current (It − Id), the program classified this output as "0"; if the current was more than It + Id, the program classified it as "1". The demarcation current Id was used to avoid errors in the classification of objects with output values close to It. In this case, It and Id were 3 μA and 0.5 μA, respectively, and the durations of the learning pulses were 600 s for potentiation and 30 s for depression. The obtained results showed that the system requires 15 steps of the procedure to be trained to perform the NAND function. The system memorized the state for at least 1 h after the training procedure. The training was performed with different pulse durations: a decrease of the pulse duration results in an increase of the number of learning steps, and an optimal pulse duration, for which the whole training procedure is shortest, was found at 200 s per potentiation pulse in our case. Other parameters that influence the number of learning steps are It and Id: the higher they are, the more steps are required. After implementing learning according to the NAND function on the PANI memristor-based perceptron, we re-trained it to perform classification according to the NOR logic function, with the same parameters for the input voltages and output currents. In this case, only two steps with a potentiation pulse duration of 200 s were required to re-train the perceptron. These steps were the (0, 1) and (1, 0) combinations on the functional inputs, for which the output of NOR differs from that of NAND. After these steps, the output error was zero for all incoming input combinations of the NOR function.
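The error-correction rule just described can be summarized at the algorithmic level. In the sketch below the weights are abstract signed numbers (in hardware, each weight is a conductance changed by fixed-duration potentiation or depression pulses), and only the sign of the error is used, as in the experiment; the step size and the initialization are illustrative assumptions.

```python
# Sketch of the error-sign (Rosenblatt) training rule used for the
# memristive perceptron.  Only the *sign* of the error drives the
# fixed-size weight updates, never its magnitude.

NAND = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
ETA = 0.25                         # effect of one fixed-duration pulse
                                   # (illustrative value, not from text)

def forward(w, x1, x2):
    s = w[0] + w[1] * x1 + w[2] * x2
    return 1 if s > 0 else 0

def train(target, w, max_epochs=100):
    for epoch in range(max_epochs):
        errors = 0
        for (x1, x2), y_d in target.items():
            err = forward(w, x1, x2) - y_d
            if err == 0:
                continue               # err = 0: no action required
            errors += 1
            for k, active in enumerate((1, x1, x2)):
                if active:             # only active inputs are pulsed
                    w[k] -= ETA * (1 if err > 0 else -1)
        if errors == 0:                # a full epoch of zero errors
            return w, epoch + 1
    return w, max_epochs

w, n = train(NAND, [0.0, 0.0, 0.0])
print("NAND learned in", n, "epochs; weights:", w)
NOR = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0}
w, n = train(NOR, w)               # re-training, as in the experiment
print("NOR re-learned in", n, "epochs; weights:", w)
```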


The presented data demonstrated that the elementary perceptron made from PANI-based memristive devices can be trained. Note that this kind of perceptron is able to classify only linearly separable objects. To give a geometrical representation of such a classification, Figure 8.8 shows the values of the NAND function (bars) and the plane, y(x) = w0 + w1x1 + w2x2, separating the classes of input patterns that correspond to outputs "1" and "0". The conductivity of the permanently biased memristor, w0, serves as an offset of the separating plane. The training of the perceptron through the adaptation of the memristor conductivities proceeds until the plane arrives at the correct position separating the selected classes of input signals. From this point of view, the re-training of the perceptron from the NAND to the NOR function corresponds to an increase of the absolute values of the negative weights w1 and w2, until the separating plane falls below the "0" output value for the input combinations (0, 1) and (1, 0). The results of this work, along with the well-known ways of implementing formal neurons based on CMOS technology,58, 59 indicate the possibility of the physical realization of complex artificial neuronal networks using a large number of organic memristive devices as variable links. Such adaptive neuromorphic networks represent the basis for compact, flexible, light-weight, low-volatility and high-performance neural chips for a new generation of smart devices used in almost all spheres of human activity. However, as mentioned above, the elementary single-layer perceptron can classify only linearly separable objects. When that is not the case, the number of layers must be increased. Thus, the main goal of the subsequent development of this work was the hardware realization of a simple double-layer perceptron based on polyaniline memristive devices, able to solve linearly non-separable tasks. In this part of the work, we consider the first steps towards the realization of the double-layer perceptron. In particular, the realized system was tested for the capability of classification according to the XOR logic task.
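The geometric picture can be checked directly: some plane y = w0 + w1x1 + w2x2 reproduces NAND (and NOR), but none separates the XOR classes, which is what motivates the second layer discussed next. The brute-force search below is purely illustrative.

```python
# Brute-force check of linear separability for NAND, NOR and XOR.
# A function is realizable by one plane y = w0 + w1*x1 + w2*x2 iff some
# weight triple classifies all four input combinations correctly.

import itertools

def separable(truth):
    grid = [w / 4 for w in range(-8, 9)]        # coarse weight grid
    for w0, w1, w2 in itertools.product(grid, repeat=3):
        if all((w0 + w1 * x1 + w2 * x2 > 0) == bool(y)
               for (x1, x2), y in truth.items()):
            return True
    return False

NAND = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
NOR  = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0}
XOR  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
for name, t in [("NAND", NAND), ("NOR", NOR), ("XOR", XOR)]:
    print(name, "linearly separable:", separable(t))
# NAND and NOR print True; XOR prints False, hence the second layer.
```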


Figure 8.9. Logic scheme of the implemented neural network with 2 inputs, 2 hidden and 1 output neurons. Reproduced from Ref. [60] licensed under a Creative Commons Attribution (CC BY) license.

Figure 8.10. Circuit diagram of the memristor-based hardware artificial neural network, with circled "neurons", each consisting of a differential summator and an activation function. The access system is shown for the M111+ and M121+ memristive devices and omitted for the others, for simplicity. Reproduced from Ref. [60] licensed under a Creative Commons Attribution (CC BY) license.

The principal scheme of the network, shown in Figure 8.9,60 consisted of two inputs (X1, X2), two neurons within the hidden layer (several, in the general case) and an output neuron (or several neurons, in the general case). Inputs and neurons were connected by links with specific synaptic weights (wij, wjk). The circuit diagram of the network based on memristive devices is shown in Figure 8.10.


Each weight was represented by two memristive devices (see below). An essential requirement for training the network is the ability to change the resistance (proportional to the synaptic weight) of every memristive device independently of the others. To satisfy this requirement, we developed an access system based on CMOS transistors used as voltage-controlled switches. This system allowed a desired voltage to be applied to a selected memristive device during the training procedure, or the voltage to be read during information processing. Such a switch connects each memristor either to one of the inputs, when biased by a non-negative voltage, or to the reference voltage source (+0.2 V). A commutator composed of one 1-in-8 analogue switch (acting as the "master") and two more ("slave") switches connected in series allowed us to control all 12 switches in the circuit using five logic inputs (Figure 8.11).

Figure 8.11. Logic scheme of the commutator used, with 5 logic inputs (L0–L4) and 16 outputs (only 12 of them were used, according to the number of memristive devices). The separate output "All" corresponds to the application of the control voltage (+15 V) to the access systems of all memristive devices (during the reading of an input vector by the perceptron). In the absence of the control voltage, −15 V was applied to the access system, owing to the necessity of applying +0.2 V to all memristors. Reproduced from Ref. [60] licensed under a Creative Commons Attribution (CC BY) license.


Details of the working principles of the circuit are presented in Ref. [60]. Since a double-layer perceptron is able to solve linearly non-separable tasks, classification according to the XOR logic function was chosen as the task to be performed by the network. This task cannot be solved by an elementary (single-layer) perceptron, where each output neuron implements one hyper-plane separating the classes. The second-layer neurons of a double-layer artificial neuronal network, instead, perform the separation in the space of the first-layer outputs, enabling union, intersection and difference of the "subclasses" highlighted by the hidden layer of the network. In machine learning, the back-propagation with batch correction learning algorithm61 is widely used for solving non-separable tasks. The algorithm comprises the calculation of the gradient of a squared error function with respect to all the weights in the network. The gradient is fed to an optimization method which uses it to update the weights so as to minimize the squared error function. This requires very precise variations of the weight values, which was a critical issue for the hardware perceptron realization, because the resistive switching kinetics of the memristive devices were not similar enough to be captured by a unified mathematical model. Therefore, it is possible to follow only the direction (sign) of the weight correction, not its value, choosing an empirically established training pulse duration. Such a modification of the back-propagation learning algorithm leads to a strong correlation between the number of steps necessary for convergence and the initial weight distribution: the closer it was to the final distribution, the fewer steps were required. Note that not all initial states of the network led to convergence. Possible solutions to this critical issue could be the implementation of other algorithms based on spike timing-dependent plasticity (STDP) rules62 or the realization of a circuit where the conductivity of each element is measured with a contactless spectrophotometric method.63 Each step of the training procedure consisted, consecutively, of the application of the whole training set of vectors x(k) (k = 1, 2, 3, 4), actual weight measurement (applying the "reading" pulses) and weight correction (applying the "writing" pulses). The correction pulse duration values were chosen to minimize the duration of the training


steps, and the duration was kept constant (but different for depressing and potentiating pulses) for all steps in the whole learning procedure. The procedure was performed until convergence. Figure 8.12(a) shows the results of the learning procedure for the XOR logic function at the first and last iterations. Figure 8.12(b) depicts the change of the weight values after learning. As described above, each weight was adjusted by two memristive devices (their conductances are not shown separately) and is given in arbitrary units. As shown in Figure 8.12(c), the weights were adjusted so that the two output classes were separated by two planes in the feature space. In this part of the work, it was shown that memristive devices can be used for the hardware realization of multilayer artificial neuronal networks. A double-layer system was built that paves the way for the realization of a multi-layer perceptron, demonstrating the possibility of performing linearly non-separable classification according to the XOR logic function. This approach could be extended (though not directly) to larger artificial neuronal networks and to other machine learning algorithms for more complex and data-intensive tasks.
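A compact way to see the sign-only modification is to take ordinary backpropagation for a 2-2-1 network, matching the circuit described here, and quantize every weight update to a fixed step that stands in for a writing pulse of fixed duration. The step size, activation function, initialization and epoch limit are illustrative assumptions; as in the hardware, convergence is not guaranteed from every initial state.

```python
# Sign-only backpropagation for a 2-2-1 network on XOR (sketch).
# Each weight update is quantized to a fixed step, emulating a writing
# pulse of fixed duration: only the sign of the gradient is used.

import math, random

random.seed(3)
STEP = 0.15                                   # fixed "pulse" amplitude
sig = lambda s: 1.0 / (1.0 + math.exp(-s))

# w1: 2 hidden neurons x (bias + 2 inputs); w2: bias + 2 hidden weights
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w2 = [random.uniform(-1, 1) for _ in range(3)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for epoch in range(3000):
    wrong = 0
    for (x1, x2), y_d in XOR:
        h = [sig(w[0] + w[1] * x1 + w[2] * x2) for w in w1]
        y = sig(w2[0] + w2[1] * h[0] + w2[2] * h[1])
        wrong += (y > 0.5) != bool(y_d)
        d_out = (y - y_d) * y * (1 - y)       # output-layer delta
        d_h = [d_out * w2[i + 1] * h[i] * (1 - h[i]) for i in range(2)]
        for j, xj in enumerate((1, h[0], h[1])):
            g = d_out * xj
            if g:                              # sign-only update
                w2[j] -= STEP * math.copysign(1, g)
        for i in range(2):
            for j, xj in enumerate((1, x1, x2)):
                g = d_h[i] * xj
                if g:
                    w1[i][j] -= STEP * math.copysign(1, g)
    if wrong == 0:
        print("XOR learned after", epoch + 1, "epochs")
        break
else:
    print("no convergence from this initial state")  # also observed
                                                     # in the hardware
```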

8.4. Relationship of Optical and Electrical Properties

The redox reactions taking place in the active area of the memristive device have been measured in real time by optical reflectance spectroscopy in the visible and near-infrared region, using a fiber probe reflectometer.36 In Figure 8.13, the spectra of the polyaniline corresponding to the insulating and conductive states of the device are shown and compared to the reflectance of the pristine film. The spectra of the insulating and conductive film correspond, respectively, to the leucoemeraldine and emeraldine salt forms, while that of the pristine sample corresponds to the emeraldine base. The identification of the polyaniline forms was checked against the spectra reported in Ref. [28]. The device in its insulating and conductive states displays, respectively, the typical pale yellow color of leucoemeraldine and the green of emeraldine salt. The reported dependence shows the distinct difference of the polyaniline optical properties in the conductive and insulating states.



Figure 8.12. Experimental data. (a) Output signal within the epochs before (left) and after (right) training, and the expected output signal (dotted). (b) Synaptic weights and (c) the corresponding feature-plane partition (the areas above and below the plane y = 4.5 correspond to the classes "1" and "0", respectively). The obtained separating planes are implemented by the corresponding neurons in the first layer. Reproduced from Ref. [60] licensed under a Creative Commons Attribution (CC BY) license.


Figure 8.13. Reflectance spectra of insulating, conducting and pristine polyaniline film. Reprinted with permission from Ref. [36] © 2011 Springer.

In order to obtain quantitative information on the transfer between the resistance states, a spectroscopic imaging technique was used.64 Notably, it allows the transitions to be visualized in single devices, as well as the mapping of large areas composed of numerous discrete devices or of complex media composed of different materials. The optical system works as follows (see Figure 8.14). The lens focuses the image of a surface on the plane of the input slit of the spectrometer, but only the light coming from the strip conjugated with the slit enters the spectrometer. The light is dispersed by the spectrometer and focused on the plane containing the sensor of the camera. Since the spectrometer has a 1:1 image magnification, the image of the input slit is focused on the pixel rows of the sensor, while its position along the vertical axis of the sensor depends on the light wavelength. For example, if the light entering the slit is red, the image of the slit will be focused on the top rows of the sensor, while if the light is blue it will be focused on the bottom. White light in the range 400–780 nm entering the slit fills the whole sensor.


Figure 8.14. Scheme of the main optical components arranged in the scanner. Reprinted from Ref. [64] with permission. © 2015 Elsevier.

The system was used for the visualization of the real-time electrochromic variation of a PANI memristor under oxidation and reduction bias potentials. The PANI bulk reduction potential is 0.1 V, while the oxidation one is 0.3 V.24 However, our case is slightly different, because the redox processes take place in a small area, defined as the active zone; thus, we need to consider the voltage distribution along the whole memristive channel. The process dynamics were analysed in detail in Ref. [24], where different kinetics for the reduction and oxidation reactions were identified and an explanation was suggested. According to this explanation, the reduction occurs simultaneously in the whole active zone (when any negative potential is applied to the device), while the oxidation takes place gradually, with a progressive displacement of the conductive parts of the active zone from the drain to the source electrode. Initially, the memristive device was characterized by recording the temporal variations of the current values (shown in Figure 8.15), applying a DC voltage of +0.8 V for 22 min to induce the oxidation of PANI and, afterwards, −0.1 V for 16 min to induce the reduction.63 These measurements, referred to as current kinetics, were performed by sampling one current value per second. The collected images, containing the reflectance spectra, clearly show the chromic variation occurring in the active zone. Figure 8.16 illustrates two typical spectra acquired at the end of each redox process; one is related to the leucoemeraldine form and the other to the emeraldine salt form.
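In essence, the optical readout reduces to integrating the reflectance over a band that distinguishes the two redox forms and tracking that area in time alongside the current. The sketch below illustrates such processing on synthetic data; the band limits, array shapes and peak model are assumptions, not the published analysis.

```python
# Sketch: estimating the redox state of the active zone from reflectance
# spectra by integrating a band around the emeraldine-salt peak (~510 nm).
# Data here are synthetic stand-ins; in the experiment the frames come
# from the imaging spectrometer described above.

import numpy as np

wavelengths = np.linspace(400, 780, 381)          # nm
step = wavelengths[1] - wavelengths[0]

def band_area(spectrum, lo=480.0, hi=550.0):
    """Integrated reflectance over the chosen band (band limits are
    illustrative); the area tracks the redox state of the film."""
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    return float(np.sum(spectrum[mask])) * step

# synthetic (time, wavelength) stack: a 510 nm peak growing in time
rng = np.random.default_rng(0)
growth = np.linspace(0.0, 1.0, 200)[:, None]
peak = np.exp(-((wavelengths - 510.0) ** 2) / (2 * 30.0 ** 2))
frames = growth * peak + 0.05 * rng.random((200, wavelengths.size))

areas = np.array([band_area(s) for s in frames])
areas = (areas - areas.min()) / (areas.max() - areas.min())
print("normalized band area, first/last frame:",
      round(areas[0], 2), round(areas[-1], 2))
# The normalized area can be overlaid on the measured current kinetics
# (Figure 8.17) to compare optical and electrical state estimates.
```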


Figure 8.15. Kinetics of the total device current variation upon the application of +0.8 V (black curve) and −0.1 V (blue curve), respectively. The insets are optical microscope photographs of the sample taken at the end of each trend, revealing the color variation of the active area. The two squares in the insets' active areas, α and β, define the two zones used for calculating the reference spectra (see Figure 8.16). Reprinted from Ref. [63] with permission. © 2016 Elsevier.

The curves in Figure 8.16 have different shapes: the emeraldine salt one presents a broad peak centred at 510 nm, while the leucoemeraldine one has a large broad band above 550 nm. If we compare the two measured forms with Ref. [36], they clearly correspond to two different colors: the oxidized form of PANI is an intense green, while the reduced one is yellow, as expected. A direct demonstration of the validity of the proposed method for following the resistance switching kinetics of memristive PANI devices is shown in Figure 8.17 for the depression process; in the case of the reinforcement process the results are rather similar. In this part of the chapter, the validity of a spectroscopic imaging method for the determination of the resistance state of PANI memristive devices has been demonstrated. The results allowed us to confirm the explanation of the different kinetics of the redox


Figure 8.16. Reflectance spectra measured when the active zone was in the two different redox states. The areas under the two curves were highlighted and calculated. The two insets in the graph, named a and b, are the spots referred to in Figure 8.15 that gave these reference spectra. Reprinted from Ref. [63] with permission. © 2016 Elsevier.

Figure 8.17. Comparison of optical and electrical measurements during the process of depression of the conductivity of the organic memristive device: red curve for the fit of the area and blue dashed curve for the current variation. Reprinted from Ref. [68] with permission. © 2016 Elsevier.


processes suggested in Ref. [24]. Given the capability of the spectrophotometer to acquire large areas64 (up to half a meter), such a characterization could be extended to many samples or to complex media at the same time. This work represents a significant step towards the monitoring of complex networks based on memristive devices, such as multi-layer perceptrons. Indeed, applying this method to a perceptron will make it possible to monitor the conductivity of each device without electrical checking. In fact, electrical measurement of the conductivity states of the memristive devices can disturb the whole system, affecting its correct operation. Note that this technique can be used not only for monitoring the resistance state of single devices, but also for the control of the system state when it contains other (even biological) elements. For example, it was shown that the growth of the slime mould Physarum polycephalum varies the conductivity and color of the polyaniline layer.64

8.5. Neuromorphic Applications

8.5.1. Frequency dependent plasticity

An important feature of nervous systems and the brain is that the strength of the connections between nerve cells depends not only on the number of spikes passed, but also on the frequency of their arrival. Typically, to induce long-term potentiation, 2–5 ms stimuli delivered at 100 Hz are required, while for inducing depression the required frequency is reduced to 1 Hz.65–68 Accordingly, organic memristive devices were tested by applying pulses of the same shape (in terms of amplitude and duration) for both potentiation and depression, but varying the frequency of the spikes, in analogy with the typical shape of the biological action potential (Figure 8.18(a)). Long-term potentiation (LTP) and long-term depression (LTD) were induced according to the natural ratio between the pulse duration (5 ms) and the time intervals between pulses (from about 10 ms to about 1000 ms, respectively), that is, about 0.5 in the first case and 0.005 in the latter.65


Figure 8.18. LTP and LTD induction with pulses of different frequency: panel (a), biological action potential and the pulse used for the experiment; the dotted line highlights the "resting" potential. Panel (b), a typical voltage profile used in the experiment; the conductivity was acquired through a reading phase before and after the training routine. Panel (c), detail of the pulse used in the experiment as the training routine; the dotted lines highlight the oxidation and reduction threshold voltages, and Δt is the time interval between the end of the first and the beginning of the second pulse. Panel (d), map of the variation of the current (ΔI/I0) as a function of the number and the frequency of the spikes applied to the memristor in the LTD regime; for low-frequency pulses, the memristive device exhibits a negative variation of the current, an indication that a depression process is performed. Panel (e), map of the variation of the current (ΔI/I0) as a function of the number and the frequency of the spikes in the LTP regime; at higher frequencies, the increase of the device conductivity, and thus the successful LTP, is demonstrated by a positive variation of the current between the initial and final current values. Reprinted from Ref. [69] with permission. © 2019 Elsevier.

The typical voltage profile used in the experiment is shown in Figure 8.18(b): the training routine is reported in Figure 8.18(c), while the initial and final intervals correspond to the reading phases.69
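The frequency dependence can be caricatured in a few lines: identical fixed-shape pulses either potentiate or depress the device depending only on the inter-spike interval. The rate constants and the frequency threshold below are illustrative guesses, not values fitted to the data of Ref. [69].

```python
# Toy model of frequency-dependent plasticity (sketch; all constants
# are illustrative).  Identical 5 ms pulses are applied; only the
# inter-spike interval decides between potentiation and depression.

def apply_train(g, n_spikes, interval_ms,
                f_threshold=20.0, k_up=0.02, k_down=0.01):
    """Return conductance after n_spikes with the given interval."""
    freq = 1000.0 / (5.0 + interval_ms)      # spike rate in Hz
    for _ in range(n_spikes):
        if freq > f_threshold:               # high frequency -> LTP
            g += k_up * (1.0 - g)
        else:                                # low frequency -> LTD
            g -= k_down * g
    return g

g0 = 0.5
print("100 spikes, 10 ms gaps   ->", round(apply_train(g0, 100, 10), 2))
print("100 spikes, 1000 ms gaps ->", round(apply_train(g0, 100, 1000), 2))
# The same pulse shape potentiates at short intervals (ΔI/I0 > 0) and
# depresses at long ones (ΔI/I0 < 0), as in Figures 8.18(d) and (e).
```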


For the low-frequency stimuli, all the tested combinations of number and frequency of the input signals led to a decrease of the device output current (Figure 8.18(d)), that is, to a negative variation of the output current value (ΔI/I0 < 0). In other words, even a relatively short training routine can induce an effective depression of the device conductivity. This behavior is well documented in biological synapses,68 where the effect of a prolonged stimulation at a low fixed frequency is comparable to multiple sequences of stimulations with shorter periods. An individual short stimulation in the latter protocol induces an LTD of small entity that, however, can be accumulated over successive sequences of inputs. As shown in Figure 8.18(d), the lower the frequency, the smaller the number of spikes necessary to switch off the device conductivity; with increasing frequency, a more intense training routine must be applied to achieve the same result.

When high-frequency signals are applied to a single organic memristive device (Figure 8.18(e)), it increases its internal conductivity proportionally to their number and frequency. To obtain the highest current change, the concomitant presence of both a high frequency and a large number of applied pulses is necessary. This behavior closely emulates the already mentioned biological temporal integration, as well as the biological dependence of LTP on the number of stimulations (for a fixed frequency).66 Moreover, in Figure 8.18(e) a region where the training routine is not effective can be identified: below a certain stimulation frequency, the organic memristive devices do not produce any increase of the conductivity, even with the application of intense training pulses. In addition, for all the explored frequencies, below a threshold number of input stimulations the organic memristive device does not show any significant variation of its internal conductivity. The results reported in Figure 8.18(e) also demonstrate nonlinear synaptic temporal integration. In fact, above a certain number of incoming inputs, all the responses to the higher frequencies show a gradual variation of the current that increases nonlinearly with the number of stimuli, in very good agreement with the biological LTP behavior. Summarizing, the overall trends finely reproduce, in several respects, the typical LTP and LTD65–67 functions of biological


synapses. In analogy with other inorganic or organic devices (memristive or not), organic memristive devices present a transition between LTD and LTP as a function of the polarity of the incoming stimuli. A feature that strongly distinguishes them from other kinds of devices is, however, the possibility of inducing the transition between depression and potentiation by means of a frequency variation, using pulses of fixed shape instead of a polarity change.

8.5.2. Nervous system mimicking circuits

The first circuit mimicking, in DC mode, the learning of Pavlov's dog was based on a single organic memristive device.70 The system contained two inputs ("food" and "bell", respectively) and one output. In that case, the value of the device conductivity and, therefore, the output current was shown to be a function of the duration of the simultaneous application of both input signals. A significant step forward was made with the reproduction of the part of the nervous system of the pond snail Lymnaea stagnalis responsible for the learning of this animal during feeding.71 This system was chosen as our biological reference because the processes of signal propagation in the nervous system of this animal have been experimentally studied in great detail,72–76 leading to the construction of computational models of the electrophysiological basis of plasticity and electrical pattern generation77, 78 in the neural circuitry, which were used as a guide for the artificial implementation with circuits based on memristive devices.71 The main idea of this part of the work was to fabricate a circuit capable of mimicking the learning of the pond snail during feeding. Learning in this case means the association of an initially neutral chemical input (the conditioned stimulus (CS); amyl acetate) with the initiation of a series of rhythmic movements of the feeding muscles by which the animal captures and ingests food. Without training, these feeding movements are normally elicited in the presence of food particles (the unconditioned stimulus (US)), but not in the presence of the CS only. Training is achieved through single-trial, food-reward classical conditioning, a learning protocol consisting of a


simultaneous application of amyl acetate and food stimuli to the sensory apparatus of the animal, after which the application of the CS only can trigger movements of the feeding muscles.79 In other words, the animal has learned to respond to a previously neutral stimulus (CS). Importantly, only two synapses are required to make this kind of learning possible, as shown in Figure 8.19.79 The circuit, based on a heterosynaptic learning mechanism, is shown in Figure 8.20. Its topology is very similar to the biological model (Figure 8.19), with two memristive devices in cascade

Figure 8.19. Diagram showing the interactions of the cerebral giant cells (CGCs) with sensory neurons (SNs) and command interneurons (cerebrobuccal interneurons (CBIs)), which mediate the increased response of the system to the conditioned stimulus (CS) after conditioning. Arrows indicate synapse positions. Reprinted with permission from Ref. [71] © 2011 Springer.

Figure 8.20. Diagram of the model circuit with heterosynaptic connectivity for mimicking learning in Lymnaea stagnalis. Reprinted with permission from Ref. [71] © 2011 Springer.


Figure 8.21. Experimental results mimicking learning in Lymnaea stagnalis with the circuit shown in Figure 8.20. Reprinted with permission from Ref. [71] © 2011 Springer.

corresponding to the synapses marked with arrows in Figure 8.19. A periodic signal applied to input 1 has an amplitude sufficiently high to switch one memristive device, representing a synapse, into the conducting state. However, as this amplitude is distributed between two memristive devices, it is not sufficient to make either of them conducting. During training, the DC potential on input 2 is sufficient to switch the second memristive device (the one close to the output) into the conducting state. After this, the DC component of the periodic potential of input 1 drops mainly across the first memristive device, and its value is sufficient for its transition to the conducting state. The experimental results for the output signal before and after training, as a result of the action of the periodic input signal only (the CS analog), are shown in Figure 8.21. Both the amplitude and the offset of the output signal were increased by approximately a factor of five compared with the initial values. This behavior is in good agreement with the modulatory role of heterosynaptic connections in biological neural circuits and their presumed involvement in the establishment of long-term memory.80 Moreover, this modulatory input can trigger a cascade of intracellular molecular events, which lead to relatively long-term modifications of synaptic function. The realized system shows adaptive behavior that mimics part of the associative learning processes after a classical conditioning algorithm; that is, the synthetic structure implements a bio-inspired


heterosynaptic mechanism. Note that in this case we have mimicked not only the function (as was done in several papers dedicated to mimicking Pavlov's dog learning),81–84 but also the architecture of the part of the nervous system responsible for this learning.

The next step in mimicking the learning of living beings was to test the applicability of the STDP algorithm.85 For the STDP implementation, the connected gate and drain electrodes were assigned as the presynaptic input, and the source electrode was the postsynaptic one. Identical potential pulses were used as pre- and postsynaptic spikes, with a constant applied bias voltage of +0.2 V to avoid changes in conductivity between spikes. The amplitudes of the spikes were chosen to be 0.3 V, so the maximum voltage drop across a memristive element was equal to +0.8 V and the minimum one to −0.4 V. The full length of the pulse was 800 s; during the first half the potential increased from 0 to +0.3 V, and during the remaining time it increased from −0.3 to 0 V. Postsynaptic pulses were applied after presynaptic pulses with a delay time Δt (both positive and negative). Interchanging the pre- and postsynaptic electrodes changes only the sign of the delay Δt. The resulting pulses are shown in Figure 8.22(a). The resulting voltage (the difference between the post- and presynaptic potentials) across the memristive element during the measurement, for the delay time Δt = 200 s, is shown in Figure 8.22(b). Conductance values were measured by the application of a testing voltage of +0.3 V for 30 s before and after the pre- and postsynaptic pulse sequences. Generally, in neuromorphic applications of memristive elements, synaptic weights are identified with their conductance; thus, weight changes were normalized to the initial value of the conductance. We measured the weight changes due to STDP for Δt = ±1000, ±600, ±400, ±200 and ±100 s, each time resetting the conductance value before the application of the spikes. The weight-change dependence on the delay time (the STDP window) determined in this way, averaged over several samples, is shown in Figure 8.23.85
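The essence of the protocol is that the overlap of two identical biphasic pulses shifted by Δt produces a net voltage excursion whose sign and duration depend on Δt; accumulating the response of a toy voltage-driven device over that excursion yields an STDP-like window. The sketch below is a crude numerical caricature; the device model and all constants are assumptions, not the published parameters.

```python
# Crude STDP-window sketch: two identical biphasic spikes (rising from
# 0 to +A over the first half, then from -A back to 0) are shifted by
# a delay dt, and the conductance change of a toy voltage-driven
# device model is accumulated.  All parameters are illustrative.

import numpy as np

A, T = 0.3, 800.0                       # spike amplitude [V], length [s]

def spike(t):
    up = (t >= 0) & (t < T / 2)
    down = (t >= T / 2) & (t < T)
    return np.where(up, 2 * A * t / T, 0.0) \
         + np.where(down, 2 * A * (t - T) / T, 0.0)

def weight_change(dt, v_bias=0.2, v_ox=0.4, k=0.005, g0=0.5):
    t = np.arange(-T, 2 * T, 1.0)
    v = v_bias + spike(t - dt) - spike(t)   # post (delayed) minus pre
    g = g0
    for vi in v:                            # toy conductance dynamics
        if vi > v_ox:                       # potentiation above threshold
            g += (1.0 - g) * k * (vi - v_ox)
        elif vi < 0.0:                      # depression at negative bias
            g -= g * k * (-vi)
    return (g - g0) / g0

for dt in (-400, -200, 200, 400):
    print(dt, "->", round(weight_change(dt), 2))
# Positive delays potentiate, negative ones depress, and the magnitude
# shrinks as |dt| grows, qualitatively like the window in Figure 8.23.
```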



Figure 8.22. (a) Shapes of the presynaptic and postsynaptic potential pulses. (b) Resulting voltage across the memristive element for the specific value Δt = 200 s. Reprinted from Ref. [85] with permission. © 2018 Elsevier.

Figure 8.23. STDP window for the organic PANI-based memristive element: relative weight (conductance) changes for different delay Δt values, experimentally obtained (squares) and simulated (circles and lines). Reprinted from Ref. [85] with permission. © 2018 Elsevier.


In this work, it was experimentally demonstrated that the conductance state of an organic memristive device can be changed by the STDP mechanism. The experimental STDP window is in good agreement with the data available for biological synapses86 and can provide a basis for unsupervised learning.87 The results could further be used for the simulation of the hardware realization of spiking artificial neuron networks based on organic memristive elements as synapses.

A further development of these results allowed the construction of a circuit mimicking Pavlov's dog learning according to the STDP mechanism. The work we consider now is based on another type of memristive device, whose working principle is based on filament growth inside a parylene layer (similar results were also obtained on PANI-based memristive devices (manuscript in preparation)). The description of the parylene-based element can be found in Ref. [88]. As the obtained results are similar to those for the PANI device, we consider them here.

Identical voltage pulses were used as pre- and postsynaptic spikes, of heteropolar bi-rectangular (inset in Figure 8.24(a)) or bi-triangular (inset in Figure 8.24(b)) shape.89 The amplitudes of the bi-rectangular and bi-triangular spikes were chosen to be 0.7 V and 0.8 V, respectively, so that a spike by itself could not lead to a conductivity change in the structure. On the other hand, if two spikes are summed up, the voltage drop across the memristive device increases up to ±1.4 or ±1.6 V, which is within the switching range of the sample.88 The pulse half-durations were 150 and 200 ms, with a gap of 50 ms. Postsynaptic pulses were applied after (or before) presynaptic pulses with a varying delay time Δt (ranging from −500 to 500 ms with a step of 50 ms).89 As is clear from Figure 8.24, the experimental results obey a rule similar to the STDP observed in biological systems.62 Synaptic potentiation (ΔG > 0) was observed for Δt > 0, and synaptic depression (ΔG < 0) for Δt < 0.

After the successful experimental implementation of STDP-like learning for parylene-based memristive structures, a step forward was made to demonstrate their utility in constructing simple neuromorphic networks. For this purpose, we chose the task of classical (also known as Pavlov's dog) conditioning81, 90


Figure 8.24. STDP window of Cu/PPX/ITO memristive structures (for various initial conductance values) obtained with heteropolar (a) bi-rectangular and (b) bi-triangular spike pulses shown in the figure insets. Postsynaptic spikes were applied after (before) pre-synaptic ones with a varying delay time Δt. Every point of the curves is a median of 10 recorded experimental values. Reprinted from Ref. [89] licensed under a Creative Commons Attribution (CC BY) license.

and constructed a network consisting of two presynaptic neurons connected with a postsynaptic one (Figure 8.25(a)). The first presynaptic neuron is connected to the postsynaptic one via a resistor R, corresponding to an unconditioned stimulus (e.g., "food") pathway. The connection of the second presynaptic neuron is represented by a memristive element, corresponding to an initially neutral stimulus (e.g., a "bell") pathway. Each neuron was implemented in software: the presynaptic neurons were programmed to generate spikes of amplitude Usp (the spike shape was similar to that in the inset of Figure 8.24(b)), and the postsynaptic one was used as a threshold unit (generating spikes only when the total input current exceeds the threshold current Ith, chosen to be slightly less than the ratio Usp/R). The bottom electrode of the memristive element was connected to the output of the postsynaptic neuron. This electronic implementation of Pavlov's dog is similar to that in Ref. [91], where constant-signal learning without the use of any STDP-like rules was applied. Another implementation was



Figure 8.25. STDP-like learning implementation of the memristive Pavlov's dog. (a) The electrical schematic diagram: N1, the 1st pre-neuron, spiking after the "food"-related stimulus; N2, the 2nd pre-neuron, spiking after the "bell" stimulus; N3, the post-neuron, which spikes when the total input current exceeds the threshold; R, a resistor with a constant resistance value R = 2 kΩ; M, a memristive element, initially in the Roff = 20 kΩ resistive state. A post-spike is generated unconditionally after a spike comes from N1, and under the condition that the memristor current exceeds Ith after a spike comes from N2. (b) A typical spike pattern applied to the inputs of the scheme: 1, the initial pulse (1st epoch) on the resistor (R) (unconditioned stimulus), resulting in post-spike (P) 2, which in turn arrives at the memristor (M) as pulse 3 (dashed) in inverted form; 4, the pulse on the memristor, initially without post-neuron activity; 5, simultaneous pulses on the resistor and the memristor, which result in post-spike 6, leading to the teaching pulse 7 (dashed); 8, a post-spike as a result of the conditioned stimulus when the training is completed (epoch n, where n is equal to or above the number of epochs needed for successful conditioning). Reprinted from Ref. [89], licensed under a Creative Commons Attribution (CC BY) license.

proposed in a network with pseudo-memcapacitive synapses and a Hebbian-like learning mechanism.92 The learning procedure involved three steps: (1) application of a signal only to the unconditioned stimulus pathway (by this step we check the correct postsynaptic neuron activity; that is, the dog starts "salivating" when exposed to "food"); (2) application of a signal only to the conditioned stimulus pathway (in this step we check whether the initially neutral stimulus has become a conditioned one); (3) pairing of the two stimuli, in which step the conditioning (learning) occurs. These three steps constituted one epoch of learning, shown schematically in Figure 8.25(b). It has thus been successfully demonstrated that memristive devices are capable of learning (including learning using biologically inspired STDP-like rules). These organic memristive devices exhibit the advantages of a low switching voltage (down to 1 V), a high Roff/Ron ratio (up to 10⁴), a long retention time (≥10⁴ s) and multilevel resistance switching (at least 16 stable resistive states). It has been experimentally shown that these memristive elements can be trained by the spike learning mechanism, and the model of classical conditioning (an electronic "Pavlov's dog") was implemented as a simple neuromorphic circuit using organic memristive devices.
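The described circuit lends itself to a behavioral sketch: an unconditioned pathway with fixed resistance R, a conditioned pathway whose memristive resistance drops whenever a "bell" spike coincides with a post-spike (the teaching pulse), and a current-threshold post-neuron. R, Roff and the threshold rule follow the caption of Figure 8.25; the spike amplitude and the per-epoch resistance drop are illustrative assumptions.

```python
# Behavioral sketch of the memristive "Pavlov's dog" (values of R,
# Roff and the threshold follow the caption of Figure 8.25; the
# per-epoch resistance drop is an illustrative assumption).

U_SP = 1.0                      # spike amplitude [V] (assumed)
R = 2e3                         # unconditioned ("food") pathway [Ohm]
R_ON, R_OFF = 2e3, 20e3         # memristor limits [Ohm]
I_TH = 0.9 * U_SP / R           # threshold slightly below Usp/R

def post_spikes(food, bell, r_m):
    """Threshold post-neuron: total input current vs. I_TH."""
    i = (U_SP / R if food else 0.0) + (U_SP / r_m if bell else 0.0)
    return i > I_TH

r_m = R_OFF
for epoch in range(1, 21):
    assert post_spikes(1, 0, r_m)            # step 1: food alone works
    if post_spikes(0, 1, r_m):               # step 2: bell alone?
        print(f"conditioned after {epoch - 1} epochs, R_m = {r_m:.0f} Ohm")
        break
    # step 3: pairing; the post-spike reaches the memristor as an
    # inverted teaching pulse, potentiating it (resistance decreases)
    r_m = max(R_ON, r_m * 0.8)
```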


8.5.3. Towards synapse prosthesis

A synapse is a biological structure which connects two neurons, enabling specific and unidirectional information flow (excitation or inhibition) from one neuron to the other. Synaptic connections are the key elements of neuronal networks, and their plasticity underlies learning and memory. The restoration of synaptic connections, in the case of traumatic injury as well as in other pathologies associated with a synaptic loss of function, could be achieved through the introduction of electronic synapses connecting neurons directly, provided that these artificial synapses recapitulate the main features of natural synapses, including their plasticity. Moreover, the development of electronic synapses with features unattainable in evolution, owing to biological constraints, could result in the creation of cyborgs with unprecedented capacities. In this part of the work we consider the first steps toward the experimental realization of a synapse prosthesis connecting two live neural cells from the rat cortex.93

Patch-clamp recordings from non-connected pairs of cortical layer 5 pyramidal neurons in rat brain slices (Figure 8.26(a)) were used.93 Action potentials (APs) generated by suprathreshold depolarizing current injection in either neuron failed to generate any response in the other cell of the pair (Figure 8.26(c)), indicating that these



Figure 8.26. Activity-dependent coupling of neurons by an organic memristive device. (a) Infrared differential interference contrast microphotograph of a P7 rat brain slice with visually identified L5/6 neocortical cells (Cells 1, 2) recorded simultaneously. (b) Simplified electrical scheme of two patch-clamp amplifier headstages (Patch 1, 2); labels 1 and 3 are the patch-clamp holding inputs, 2 and 4 the patch-clamp primary outputs; plus the circuit based on an organic memristive device connecting the two neurons. (c) and (d) Traces of current-clamp recordings from Cells 1 and 2 before (c) and after (d) coupling through the organic memristive device. Traces 1–4 correspond to the inputs/outputs as labeled in (b). Note that prior to coupling by the memristive device (c), APs in either neuron failed to evoke responses in the other neuron, indicating that these cells were not connected by natural synapses. After the connection of Cells 1 and 2 through the organic memristive device (d), the efficacy of coupling progressively increases with each consecutive depolarizing step/AP in Cell 1. 500 traces are aligned with the suprathreshold depolarizing steps delivered to Cell 1. Bottom plot: the organic memristive device resistance as a function of the sweep number. Dashed lines indicate the first sweep at which Cell 2 started firing. (e) Corresponding plots of the activity-dependent change in the spike probability in Cell 2 (top), the spike delay of Cell 2 from Cell 1 (middle) and the spike delay jitter in Cell 2 (bottom). (f) Histogram of the spike delay in Cell 2 from Cell 1, calculated for three cell pairs coupled through organic memristive devices (777 spikes). Reproduced with permission from Ref. [93] © 2020 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

cells were not connected by natural synapses in either direction. These neurons were then connected through an electronic circuit with an organic memristive device playing the role of a synapse analog (Figure 8.26(b)). After initially setting the memristive device resistance at high values by negative voltage loading (Figure 8.26(d), bottom), APs in the "presynaptic" Cell 1 (Figure 8.26(d), plot 2) were induced by suprathreshold depolarizing steps (Figure 8.26(d), plot 1). However, these APs in Cell 1 produced only a subthreshold depolarizing response in the "postsynaptic" Cell 2 (Figure 8.26(d), plot 4), due to the high initial memristive device resistance (Figure 8.26(d), bottom panel, and Figure 8.26(e), top panel). Since the resistance reduces upon depolarization,24 the consecutive depolarizing steps and APs in Cell 1 induced a gradual increase of the voltage responses after the memristive device (Figure 8.26(d), plot 3) and in Cell 2 (Figure 8.26(d), plot 4). When the memristive device resistance had decreased by a factor of ≈2 (Figure 8.26(d), bottom, sweep #113), the depolarizing response in Cell 2 reached the AP threshold (≈−40 mV)


and Cell 2 started reliably firing APs (Figure 8.26(d), plot 4, from sweep #113 onwards, and Figure 8.26(e)). Along with a further decrease of the organic memristive device resistance (Figure 8.26(d), bottom), the firing probability of Cell 2 gradually increased (Figure 8.26(e), top). The activity-dependent increase in spike coupling between the neurons was also associated with an improvement of the spike timing through the organic memristive device synapse, as evidenced by a progressive reduction of the AP delays (Figure 8.26(d), plot 4, and Figure 8.26(e), middle plot) and a reduction of the jitter of the AP delays (Figure 8.26(d), plot 4, and Figure 8.26(e), bottom plot).94, 95 Figure 8.26(f) shows the histogram summarizing the delay times of Cell 2 spiking after Cell 1 for three cell pairs (777 spikes recorded). It is noteworthy that the characteristic timing of the AP commutation through the organic memristive device synapse is similar to that of natural excitatory synapses.96

To further assess the synaptic response, it was also examined whether organic memristive device coupling could enable neuronal synchronization during spontaneous activity. To this end, the presynaptic Cell 1 was continuously depolarized by the injection of a constant inward current to allow spontaneous firing. It was observed that the firing of Cell 1 induced a gradual decrease of the organic memristive device resistance which, in turn, induced an increase of the Cell 2 responses, as in the previously described experiments. As soon as the suprathreshold level of cell coupling was achieved, Cell 2 started firing APs in synchrony with Cell 1, with a time delay of about 3.8 ms. The synchronized firing of the neurons coupled through the organic memristive device occurred in the δ-frequency range (about 0.56 Hz), which is characteristic of the slow-wave cortical activity during deep sleep.97

In this study, experimental evidence of the unidirectional, activity-dependent coupling of live neurons through an organic memristive device has been provided. It has been demonstrated that the spike-timing features of the artificial synapse based on the organic memristive device approach those of natural excitatory synapses, that the magnitude of such coupling can be controlled by the neuronal activity, and that these artificial synapses efficiently

page 254

August 3, 2021

17:51

Handbook of Unconventional Computing (in 2 Vols.) - 9in x 6in

b4205-v2-ch08

Organic Memristive Devices for Bio-inspired Applications

255

support neuronal synchronization in a simple two-neuron network. In addition, there are important indications that organic memristive devices, apart from being key candidate elements for neuromorphic computational systems,24 should also be considered as suitable elements for developing “synapse prostheses,” useful for alternative computational systems in which the same elements are used for memorizing and processing information.

8.5.4. Stochastic self-organized computational systems

One of the most important features of organic materials is their capability to be organized into 3D self-assembled systems (something that is impossible in the inorganic world). In this part of the work we will consider how this capability was used for the realization of a 3D system with adaptive electrical properties.98 The development of a system capable of performing complex information processing, learning and decision making implies the requirement for a very large number of elements (10¹⁴–10¹⁵ synapses in the human brain). A direct approach to such a level of integration cannot be based on modern lithography: even though significant progress has been achieved in this direction, currently only planar systems can be manufactured (in the best cases of the current state of the art, only a few (eight) layered structures can be fabricated). In contrast, in the brain, active neurons and the connections between them form 3D structures, allowing the formation of multiple adaptive signal pathways.99 Therefore, recently developed bottom-up approaches based on self-assembly processes can be considered as an alternative method to construct complex adaptive networks.100 Moreover, only the use of organic materials allows the self-organization of such systems; in the case of inorganic memristive devices, lithography techniques are the only available approach to constructing networks with high levels of integration. First attempts to construct 3D adaptive networks were made using fibrillar systems,101,102 as well as supports with a porous, developed structure.32,33,39 Even if some interesting results have been obtained, stochastic self-organized networks have revealed better properties.

In this part of the work we will consider the development of a 3D stochastic network of memristive devices using phase separation of specially synthesized block copolymers, containing a solid electrolyte (polyethylene oxide) and insulating parts (poly(styrene sulfonic acid)), PANI and gold nanoparticles.103,104 The formation of a 3D self-assembled network was confirmed by microscopic studies and by a series of electrical measurements showing connections that can occur only in 3D structures.98 A block copolymer poly(styrene sulfonic acid)-b-poly(ethylene oxide)-b-poly(styrene sulfonic acid) (PSS-b-PEO-b-PSS) was prepared following a protocol in which the PEO block was obtained with higher molecular weights, to ensure greater morphological stability of the 3D matrix assembly. Four Cr electrodes were deposited on a glass (or other insulating) support. A layer of the composite material, containing the block copolymer, PANI and gold nanoparticles, was deposited on this support and mechanically patterned in order to make connections between two pairs of diagonal electrodes in a crossed configuration. A ring made from adhesive Kapton film 36 mm thick was placed over the crossed area and a PEO gel containing Li+ and H+ ions was deposited within the area restricted by the ring. Three silver wires, acting as reference (gate) electrodes, were placed in contact with the PEO gel, taking special care to prevent electrical contact of these wires with the active layer. The area was protected by Kapton film in order to prevent degradation of the system. The gold nanoparticles were included because the significant difference between the gold and PANI work functions makes these particles behave as threshold elements (there is no barrier for the incoming current, but there is a barrier for the current exiting the particle).105 A scheme of the system used for the electrical characterization of the experiments mimicking learning is shown in Figure 8.27(a). Typical characteristics of the ionic (b) and electronic (c) currents are also shown in Figure 8.27.98 Training of the system was performed in order to induce highly conducting pathways between one diagonal pair of input–output electrodes (Figure 8.27(a); In1–Out1) and to inhibit the conductivity between the other pair (In2–Out2).

(a)

(b)

(c)
Figure 8.27. Scheme of the system used for the learning experiments (a) and typical cyclic voltage–current characteristics for ionic (b) and electronic (c) conductivity measured between each input–output pair. Maximum (at about +0.5 V) and minimum (at about +0.1 V) of the ionic current correspond to the oxidation and reduction potentials of PANI, respectively. As a result, the increase or decrease of electronic conductivity is observed. The presence of hysteresis indicates the memory effect in the system. Reproduced from Ref. [98] with permission from The Royal Society of Chemistry.

Two different modes of training, namely sequential and simultaneous training algorithms, were applied. In the case of sequential training, a voltage of +0.8 V was initially applied between one pair of electrodes (In1–Out1). When the conductivity variation reached saturation, this pair of electrodes was disconnected and −0.2 V was applied between the other pair of electrodes (In2–Out2). To test the training results, the currents between both pairs of electrodes were analysed by applying a voltage of +0.4 V. This value was chosen because no change in the conductivity state of the device can occur at this potential. Test results, measured immediately after training and 2 h later, are summarized in Table 8.2. These tests indicate the successful training of the system. The conductivity ratio between the reinforced and inhibited pairs of electrodes is about 70.
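As a rough illustration of this protocol, the following toy model (ours, not from the chapter; the update rule and all parameters are hypothetical) treats each pathway as a single conductance that potentiates above the PANI oxidation potential (about +0.5 V), depresses below the reduction potential (about +0.1 V), and is left unchanged by the +0.4 V read bias:

    # Toy conductance model of one signal pathway (hypothetical parameters).
    def train(g, v, steps=100, rate=0.05, g_min=0.01, g_max=1.0):
        for _ in range(steps):
            if v > 0.5:                    # above oxidation potential: potentiate
                g = min(g_max, g + rate * (g_max - g))
            elif v < 0.1:                  # below reduction potential: depress
                g = max(g_min, g - rate * (g - g_min))
        return g                           # the +0.4 V read bias leaves g unchanged

    g1 = train(0.1, +0.8)                  # sequential step 1: reinforce In1-Out1
    g2 = train(0.1, -0.2)                  # sequential step 2: inhibit In2-Out2
    print(g1 / g2)                         # ratio of read currents at +0.4 V

With these arbitrary numbers the read-current ratio comes out on the order of 10², the same order as the experimentally observed ratio of about 70.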


Table 8.2. Test results obtained after the application of the first and the second training procedures in the case of sequential and simultaneous training between In1–Out1 and In2–Out2 pairs of electrodes.

                                 Sequential mode, I (nA)        Simultaneous mode, I (nA)
Training          Pair           After training   After 2 h     After training   After 2 h
First training    In1–Out1       400              370           200              250
                  In2–Out2       7                5             60               20
Second training   In1–Out1       170              250           60               40
                  In2–Out2       150              120           300              370

Note: The first training was done with the aim of reinforcing the conductivity between the In1–Out1 pair and inhibiting it between the In2–Out2 pair of electrodes. The second training was aimed at inverting the conductivities between these pairs of electrodes.

As a next step, a reversed training procedure was applied, aiming to reinforce the conductivity between the In2–Out2 pair of input–output electrodes (which was previously inhibited) and to inhibit the conductivity between the In1–Out1 pair of electrodes. The duration of the training was the same as in the first case. However, it was not possible to reverse the conductivity states of these signal pathways: the conductivity was only slightly decreased between the In1–Out1 pair, and only slightly increased between the In2–Out2 pair. These results are also summarized in Table 8.2. Test measurements were performed immediately after training and after delays of 1 and 2 days (no manipulation of the sample was performed during these periods before testing). It is interesting to note that in this case the conductivity between the In1–Out1 pair of electrodes increases again over time without any external influence on the system. Moreover, after 2 days its conductivity returned to the value observed before the second training procedure was started (400 nA of current when +0.4 V is applied). The next experiment consisted of applying similar training to the pathways between both pairs of electrodes simultaneously. For
this reason, both output electrodes were connected to the ground potential and voltages of different polarity and value (the same as in the previous case) were applied to the input electrodes. Testing of the efficiency of the training procedure was done similarly to the sequential training case. Test results were measured immediately after training and after a delay of 2 h (Table 8.2). The test experiments demonstrated that in this case, too, the training procedure was successful. Immediately after training the conductivity ratio between the reinforced and inhibited pathways was about 4, but it increased with time to about 12 after 2 h. This indicates the existence of internal processes in the matrix even in the absence of external stimuli. Repeated training procedures aiming to invert the conductivity of the channels were also applied in the case of simultaneous training. Test results were measured immediately after training and 2 h later (Table 8.2). The results showed that, in this case, the conductivity inversion in both channels was successful. However, the results obtained during the described training of the composite-based matrix could, in principle, be obtained even with a 2D system. Instead, high conductivity between In1–Out2 and In2–Out1 (diagonal pairs) together with low conductivity between In1–Out1 and In2–Out2 (lateral pairs) is absolutely impossible with a 2D matrix, because the elements forming a planar array will be involved in both lateral and diagonal connections. Such a task requires the formation of signal pathways in different 2D planes. Thus, if the conductivity between lateral electrodes is decreased while at the same time it is increased between diagonal electrodes (or vice versa) as a result of training, this would be direct proof of the 3D nature of the realized matrix. The results are presented in Table 8.3. The training procedure has resulted in the reinforcement of the connections between diagonal electrodes and has inhibited the lateral connections. As stated above, such connections are impossible in 2D systems. Therefore, the results presented above can be considered as direct proof that the system we have developed is a real phase-separated matrix with a 3D structure.

Table 8.3. Test results after modified training, reinforcing the conductivity between diagonal pairs of electrodes and inhibiting it between lateral ones.

Connections    Registered current (A)
In1–Out2       3.1 × 10⁻⁶
In1–Out1       10.2 × 10⁻⁹
In2–Out1       3.6 × 10⁻⁶
In2–Out2       0.28 × 10⁻⁶

It is tempting to make some comparisons between the performance of the described system and that of the brain. In contrast to computers, human and other vertebrate brains do not represent systems with a fully identical architecture. Even though, for example, mammalian brains, including human ones, share a common global architecture, each individual brain differs in its detailed connectivity. Therefore, despite many predetermined structures, the brain can be considered to some extent as a system with a stochastic distribution of nervous cells and of the connections between them. This is particularly true for the mammalian cerebral cortex, our main learning device. The connections are constituted by synapses, whose plasticity makes learning processes possible. Learning is responsible for the functional “structuring” of the brain by strengthening and/or inhibiting signal pathways. Our sequential training can be likened to “baby learning” (imprinting): in childhood we establish strong associations between different phenomena which may remain stable for practically the whole lifetime. Stable learning also demands concentration on one type of association over a significant period of time. In contrast, our simultaneous training can be compared to a daily-life situation, where we need to resolve particular problems according to the external stimuli and accumulated experience. This leads to short-term associations that can be varied (unless continuously repeated) by variations in the configuration of the external stimuli. In the brain, the mechanism of “indelible memory”106 is based on the elimination of connections on the one hand and the strengthening of the remaining ones on the other: in various parts of the nervous system an overshoot in the number of connections occurs at an early age, followed by later pruning of axonal branches or synapses. The degree of pruning depends on the richness of the environment to
which the animal is exposed: more pruning takes place after sensory deprivation within a critical period.106–109 Thus, learning during the critical period is very stable, since it contributes to the formation of the mature anatomical network. Later learning is then restricted to the strengthening of synapses within the established stable network. A qualitative illustration of the learning of a 3D network composed of stochastic connections of organic memristive devices110 involved the formation of stable long-term signal pathways, as well as of short-term connections in a dynamic equilibrium, following the variations of the input stimuli and the accumulated experience.

8.6. Conclusions

In this chapter we have considered specific features of organic memristive devices: synapse-mimicking elements that can be used as building blocks of neuromorphic circuits and systems. In conclusion, we want to underline the following advantages of the described devices. In the case of the hardware realization of artificial neuronal networks, a very important feature of polyaniline-based memristive devices is the correlation of their optical properties with the conductivity state. This feature is extremely important for the realization of multilayer artificial neuronal networks (perceptrons), as it allows the strengths of the synaptic connections and their variations to be monitored in real time, without disturbing the system or changing the weights during reading stages. In neuromorphic systems the described elements have demonstrated their applicability as synapse-mimicking devices in electronic circuits, reproducing classical Hebbian and STDP-like learning. In addition, the devices have been used as simplified synapse prostheses, coupling nervous cells from the rat brain and allowing signal transmission from one to the other, similarly to natural synapses. Finally, only in the case of organic memristive devices is it possible to use self-assembly for the organization of complex 3D systems, whose properties can be established according to the applied supervised training and unsupervised learning algorithms.

References

1. L. O. Chua, Memristor: The missing circuit element. IEEE Trans. Circuit Theor. 18, 507–519 (1971).
2. D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, The missing memristor found. Nature 453, 80–83 (2008).
3. S. Vongehr and X. Meng, The missing memristor has not been found. Sci. Rep. 5, 11657 (2015).
4. V. A. Demin and V. V. Erokhin, Hidden symmetry shows what a memristor is. Int. J. Unconven. Comput. 12, 433–443 (2016).
5. Y. V. Pershin and M. Di Ventra, A simple test for ideal memristors. J. Phys. D: Appl. Phys. 52, 01LT01 (2019).
6. Y. V. Pershin and M. Di Ventra, On the validity of memristor modeling in the network literature. Neural Networks 121, 52–56 (2020).
7. F. Corinto, P. P. Civalleri, and L. O. Chua, A theoretical approach to memristor devices. IEEE J. Emerg. Select. Topics Circ. Syst. 5, 123–132 (2015).
8. X. Cao, X. M. Li, X. D. Gao, W. D. Yu, X. J. Liu, Y. W. Zhang, L. D. Chen, and X. H. Cheng, Forming-free colossal resistive switching effect in rare-earth-oxide Gd2O3 films for memristor applications. J. Appl. Phys. 106, 073723 (2009).
9. F. Argall, Switching phenomena in titanium oxide thin films. Solid State Electron. 11, 535–541 (1968).
10. M. Cavallini, Z. Hemmatian, A. Riminucci, M. Prezioso, V. Morandi, and M. Murgia, Regenerable resistive switching in silicon oxide based nanojunctions. Adv. Mater. 24, 1197–1201 (2012).
11. A. Younis, D. Chu, and S. Li, Oxygen level: The dominant of resistive switching characteristics in cerium oxide thin films. J. Phys. D: Appl. Phys. 45, 355101 (2012).
12. Z. B. Yan and J.-M. Liu, Coexistence of high performance resistance and capacitance memory based on multilayered metal-oxide structures. Sci. Rep. 3, 2482 (2013).
13. Y. Yang, S. H. Choi, and W. Lu, Oxide heterostructure resistive memory. Nano Lett. 13, 2908–2915 (2013).
14. E. Gale, R. Mayne, A. Adamatzky, and B. de Lacy Costello, Drop-coated titanium dioxide memristors. Mater. Chem. Phys. 143, 524–529 (2014).
15. Y. Aoki, C. Wiemann, V. Feyer, H.-S. Kim, C. M. Schneider, H. Ill-Yoo, and M. Martin, Bulk mixed ion electron conduction in amorphous gallium oxide causes memristive behavior. Nat. Commun. 5, 3473 (2014).
16. A. Younis, D. Chu, and S. Li, Evidence of filamentary switching in oxide-based memory devices via weak programming and retention failure analysis. Sci. Rep. 5, 13599 (2015).
17. M. Prezioso, F. M. Bayat, B. Hoskins, K. Likharev, and D. Strukov, Self-adaptive spike-time-dependent plasticity of metal-oxide memristors. Sci. Rep. 6, 21331 (2016).
18. Z. M. Liao, C. Hou, Q. Zhao, D. S. Wang, Y. D. Li, and D. P. Yu, Resistive switching and metallic-filament formation in Ag2S nanowire transistors. Small 5, 2377–2381 (2009).
19. Y. C. Yang, P. Gao, S. Gaba, T. Chang, X. Q. Pan, and W. Lu, Observation of conducting filament growth in nanoscale resistive memories. Nat. Commun. 3, 732 (2012).
20. J. Y. Chen, C. L. Hsin, C. W. Huang, C. H. Chiu, Y. T. Huang, S. J. Lin, W. W. Wu, and L. J. Chen, Dynamic evolution of conducting nanofilament in resistive switching memories. Nano Lett. 13, 3671–3677 (2013).
21. H. Lv, X. Xu, P. Sun, H. Liu, Q. Luo, Q. Liu, W. Banerjee, H. Sun, S. Long, L. Li, and M. Liu, Atomic view of filament growth in electrochemical memristive elements. Sci. Rep. 5, 13311 (2015).
22. S. La Barbera, D. Vuillaume, and F. Alibart, Filamentary switching: Synaptic plasticity through device volatility. ACS Nano 9, 941–949 (2015).
23. I. Valov, E. Linn, S. Tappertzhofen, S. Schmelzer, J. van den Hurk, F. Lentz, and R. Waser, Nanobatteries in redox-based resistive switches require extension of memristor theory. Nat. Commun. 4, 1771 (2013).
24. V. Erokhin and M. P. Fontana, Thin film electrochemical memristive systems for bio-inspired computation. J. Comput. Theor. Nanosci. 8, 313–330 (2010).
25. V. Erokhin, T. Berzina, and M. P. Fontana, Hybrid electronic device based on polyaniline-polyethyleneoxide junction. J. Appl. Phys. 97, 054501 (2005).
26. V. Braitenberg, Vehicles: Experiments in Synthetic Psychology (MIT Press, Cambridge, MA, 1984).
27. M. Chen, D. Nilsson, T. Kugler, and M. Berggren, Electric current rectification by an all-organic electrochemical device. Appl. Phys. Lett. 81, 2011 (2002).
28. E. T. Kang, K. G. Neoh, and K. L. Tan, Polyaniline: A polymer with many interesting intrinsic redox states. Progr. Polym. Sci. 23, 277–324 (1998).
29. T. Berzina, A. Smerieri, G. Ruggeri, M. Bernabò, V. Erokhin, and M. P. Fontana, Role of the solid electrolyte composition on the performance of a polymeric memristor. Mater. Sci. Eng. C 30, 407–410 (2010).
30. T. Berzina, S. Erokhina, P. Camorani, O. Konovalov, V. Erokhin, and M. P. Fontana, Electrochemical control of the conductivity in an organic memristor: A time-resolved X-ray fluorescence study of ionic drift as a function of the applied voltage. ACS Appl. Mater. Interfaces 1, 2115–2118 (2009).
31. S. Battistoni, A. Verna, S. L. Marasso, M. Cocuzza, and V. Erokhin, On the interpretation of hysteresis loop for electronic and ionic currents in organic memristive devices. Physica Status Solidi A, 1900985 (2020).
32. S. Erokhina, V. Sorokin, and V. Erokhin, Skeleton-supported stochastic networks of organic memristive devices: Adaptations and learning. AIP Adv. 5, 027129 (2015).
33. S. Erokhina, V. Sorokin, and V. Erokhin, Polyaniline-based organic memristive device fabricated by layer-by-layer deposition technique. Electron. Mater. Lett. 11, 801–805 (2015).
34. T. Berzina, A. Smerieri, M. Bernabò, A. Pucci, G. Ruggeri, V. Erokhin, and M. P. Fontana, Optimization of an organic memristor as an adaptive memory element. J. Appl. Phys. 105, 124515 (2009).
35. A. Smerieri, V. Erokhin, and M. P. Fontana, Origin of current oscillations in a polymeric electrochemically controlled element. J. Appl. Phys. 103, 094517 (2008).
36. F. Pincella, P. Camorani, and V. Erokhin, Electrical properties of an organic memristive system. Appl. Phys. A 104, 1039–1046 (2011).
37. V. A. Demin, V. V. Erokhin, P. K. Kashkarov, and M. V. Kovalchuk, Electrochemical model of the polyaniline based organic memristive device. J. Appl. Phys. 116, 064507 (2014).
38. A. Smerieri, T. Berzina, V. Erokhin, and M. P. Fontana, Polymeric electrochemical element for adaptive networks: Pulse mode. J. Appl. Phys. 104, 114513 (2008).
39. V. Erokhin, T. Berzina, A. Smerieri, P. Camorani, S. Erokhina, and M. P. Fontana, Bio-inspired adaptive networks based on organic memristors. Nano Commun. Networks 1, 108–117 (2010).
40. A. Adamatzky, V. Erokhin, M. Grube, T. Schubert, and A. Schumann, Physarum chip project: Growing computers from slime mould. Int. J. Unconvent. Comput. 8, 319–323 (2012).
41. A. Cifarelli, A. Dimonte, T. Berzina, and V. Erokhin, Non-linear bioelectronic element: Schottky effect and electrochemistry. Int. J. Unconven. Comput. 10, 375–379 (2014).
42. A. Cifarelli, T. Berzina, and V. Erokhin, Bio-organic memristive device: Polyaniline – Physarum polycephalum interface. Physica Status Solidi C 12, 218–221 (2015).
43. A. Romeo, A. Dimonte, G. Tarabella, P. D’Angelo, V. Erokhin, and S. Iannotta, A bio-inspired memory device based on interfacing Physarum polycephalum with an organic semiconductor. APL Mater. 3, 014909 (2015).
44. G. Tarabella, P. D’Angelo, A. Cifarelli, A. Dimonte, A. Romeo, T. Berzina, V. Erokhin, and S. Iannotta, A hybrid living/organic electrochemical transistor based on the Physarum polycephalum cell endowed with both sensing and memristive properties. Chem. Sci. 6, 1067–1075 (2015).
45. V. Erokhin, T. Berzina, P. Camorani, and M. P. Fontana, On the stability of polymeric electrochemical elements for adaptive networks. Colloids Surfaces A 321, 218–221 (2008).
46. D. A. Lapkin, A. V. Emelyanov, V. A. Demin, V. V. Erokhin, L. A. Feigin, P. K. Kashkarov, and M. V. Kovalchuk, Polyaniline-based memristive microdevice with high switching rate and endurance. Appl. Phys. Lett. 112, 043302 (2018).
47. E. Juzekaeva, A. Nasretdinov, S. Battistoni, T. Berzina, S. Iannotta, R. Khazipov, V. Erokhin, and M. Mukhtarov, Coupling cortical neurons through electronic memristive synapse. Adv. Mater. Technol. 4, 1800350 (2019).
48. Y. E. Syu, T. C. Chang, J. H. Lou, T. M. Tsai, K. C. Chang, M. J. Tsai, Y. L. Wang, M. Liu, and S. M. Sze, Atomic-level quantized reaction of HfOx memristor. Appl. Phys. Lett. 102, 172903 (2013).
49. A. Wedig, M. Luebben, D.-Y. Cho, M. Moors, K. Skaja, V. Rana, T. Hasegawa, K. A. Adepalli, B. Yildiz, R. Waser, and I. Valov, Nanoscale cation motion in TaOx, HfOx and TiOx memristive systems. Nat. Nanotech. 11, 67–74 (2016).
50. H. Jiang, L. Han, P. Lin, Z. Wang, M. H. Jang, Q. Wu, M. Barnell, J. J. Yang, H. L. Xin, and Q. Xia, Sub-10 nm Ta channel responsible for superior performance of a HfO2 memristor. Sci. Rep. 6, 28525 (2016).
51. W. F. He, H. J. Sun, Y. X. Zhou, K. Lu, K. H. Xue, and X. S. Miao, Customized binary and multi-level HfO2−x-based memristors tuned by oxidation conditions. Sci. Rep. 7, 10070 (2017).
52. V. Erokhin, G. D. Howard, and A. Adamatzky, Organic memristor devices for logic elements with memory. Int. J. Bifurcation and Chaos 22, 1250283 (2012).
53. G. Baldi, S. Battistoni, G. Attolini, M. Bosi, C. Collini, S. Iannotta, L. Lorenzelli, R. Mosca, J. S. Ponraj, R. Verucchi, and V. Erokhin, Logic with memory: AND gates made of organic and inorganic memristive devices. Semicond. Sci. Technol. 29, 104009 (2014).
54. F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958).
55. F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (Spartan Books, Washington, DC, 1961).
56. P. D. Wasserman, Neural Computing: Theory and Practice (Van Nostrand Reinhold, New York, 1989).
57. V. A. Demin, V. V. Erokhin, A. V. Emelyanov, S. Battistoni, G. Baldi, S. Iannotta, P. K. Kashkarov, and M. V. Kovalchuk, Hardware elementary perceptron based on polyaniline memristive devices. Org. Electr. 25, 16–20 (2015).
58. G. Indiveri, B. Linares-Barranco, T. J. Hamilton, A. van Schaik, R. Etienne-Cummings, T. Delbruck, S.-C. Liu, P. Dudek, P. Hafliger, S. Renaud, J. Schemmel, G. Cauwenberghs, J. Arthur, K. Hynna, F. Folowosele, S. Saighi, T. Serrano-Gotarredona, J. Wijekoon, Y. Wang, and K. Boahen, Neuromorphic silicon neuron circuits. Front. Neurosci. 5, 73 (2011).
59. E. Farquhar and P. Hasler, A bio-physically inspired silicon neuron. IEEE Trans. Circuits and Systems I: Regular Papers 52, 477–488 (2005).
60. A. V. Emelyanov, D. A. Lapkin, V. A. Demin, V. V. Erokhin, S. Battistoni, G. Baldi, A. Dimonte, A. N. Korovin, S. Iannotta, P. K. Kashkarov, and M. V. Kovalchuk, First step towards the realization of a double layer perceptron based on organic memristive devices. AIP Adv. 6, 111301 (2016).
61. P. J. Werbos, The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting (John Wiley and Sons, New York, 1994).
62. S. Saighi, C. G. Mayr, T. Serrano-Gotarredona, H. Schmidt, G. Lecerf, J. Tomas, J. Grollier, S. Boyn, A. F. Vincent, D. Querlioz, S. La Barbera, F. Alibart, D. Vuillaume, O. Bichler, C. Gamrat, and B. Linares-Barranco, Plasticity in memristive devices for spiking neural networks. Front. Neurosci. 9, 51 (2015).
63. S. Battistoni, A. Dimonte, and V. Erokhin, Spectrophotometric characterization of organic memristive devices. Org. Electr. 38, 79–83 (2016).
64. A. Dimonte, F. Fermi, T. Berzina, and V. Erokhin, Spectral imaging method for studying Physarum polycephalum growth on polyaniline surface. Mater. Sci. Eng. C 53, 11–14 (2015).
65. S. Nabavi, R. Fox, C. D. Proulx, J. Y. Lin, R. Y. Tsien, and R. Malinow, Engineering a memory with LTD and LTP. Nature 511, 348–352 (2014).
66. M. F. Yeckel, A. Kapur, and D. Johnston, Multiple forms of LTP in hippocampal CA3 neurons use a common postsynaptic mechanism. Nature Neurosci. 2, 625–633 (1999).
67. S. M. Dudek and M. F. Bear, Homosynaptic long-term depression in area CA1 of hippocampus and effects of N-methyl-D-aspartate receptor blockade. In How We Learn; How We Remember: Toward an Understanding of Brain and Neural Systems: Selected Papers of Leon N. Cooper (World Scientific, 1995).
68. R. M. Mulkey and R. C. Malenka, Mechanisms underlying induction of homosynaptic long-term depression in area CA1 of the hippocampus. Neuron 9, 967–975 (1992).
69. S. Battistoni, V. Erokhin, and S. Iannotta, Frequency driven organic memristive devices for neuromorphic short term and long term plasticity. Org. Electr. 65, 434–438 (2019).
70. A. Smerieri, T. Berzina, V. Erokhin, and M. P. Fontana, A functional polymeric material based on hybrid electrochemically controlled junctions. Mater. Sci. Eng. C 28, 18–22 (2008).
71. V. Erokhin, T. Berzina, P. Camorani, A. Smerieri, D. Vavoulis, J. Feng, and M. P. Fontana, Material memristive device circuits with synaptic plasticity: Learning and memory. BioNanoScience 1, 24–30 (2011).
72. P. R. Benjamin, K. Staras, and G. Kemenes, A systems approach to the cellular analysis of associative learning in the pond snail Lymnaea. Learning and Memory 7, 124–131 (2000).
73. K. Staras, G. Kemenes, and P. R. Benjamin, Pattern-generating role for motoneurons in a rhythmically active neuronal network. J. Neurosci. 18, 3669–3688 (1998).
74. V. A. Straub and P. R. Benjamin, Extrinsic modulation and motor pattern generation in a feeding network: A cellular study. J. Neurosci. 21, 1767–1778 (2001).
75. M. S. Yeoman, A. W. Pieneman, G. P. Ferguson, A. Ter Maat, and P. R. Benjamin, Modulatory role for the serotonergic cerebral giant cells in the feeding system of the snail, Lymnaea I. Fine wire recording in the intact animal and pharmacology. J. Neurophysiol. 72, 1357–1371 (1994).
76. D. V. Vavoulis, V. A. Straub, I. Kemenes, G. Kemenes, J. Feng, and P. R. Benjamin, Dynamic control of a central pattern generator circuit: A computational model of the snail feeding network. Eur. J. Neurosci. 25, 2805–2818 (2007).
77. E. S. Nikitin, D. V. Vavoulis, I. Kemenes, V. Marra, Z. Pirger, M. Michel, J. Feng, M. O’Shea, P. R. Benjamin, and G. Kemenes, Persistent sodium current is a nonsynaptic substrate for long-term associative memory. Curr. Biol. 18, 1221–1226 (2008).
78. D. V. Vavoulis, E. S. Nikitin, I. Kemenes, V. Marra, J. Feng, P. R. Benjamin, and G. Kemenes, Balanced plasticity and stability of the electrical properties of a molluscan modulatory interneuron after classical conditioning: A computational study. Frontiers Behav. Neurosci. 4, 19 (2010).
79. I. Kemenes, V. A. Straub, E. S. Nikitin, K. Staras, G. Kemenes, and P. R. Benjamin, Role of delayed nonsynaptic neuronal plasticity in long-term associative memory. Curr. Biol. 16, 1269–1279 (2006).
80. C. H. Bailey, M. Giustetto, Y. Y. Huang, R. D. Hawkins, and E. R. Kandel, Is heterosynaptic modulation essential for stabilizing Hebbian plasticity and memory? Nat. Rev. Neurosci. 1, 11–20 (2000).
81. M. Ziegler, R. Soni, T. Patelczyk, M. Ignatov, T. Bartsch, P. Meuffels, and H. Kohlstedt, An electronic version of Pavlov’s dog. Adv. Func. Mater. 22, 2744–2749 (2012).
82. Z. Wang and X. Wang, A novel memristor-based circuit implementation of full-function Pavlov associative memory accorded with biological feature. IEEE Trans. Circuits Systems I: Regular Papers 65, 2210–2220 (2018).
83. L. Chen, C. Li, and Y. Chen, A forgetting memristive spiking neural network for Pavlov experiment. Int. J. Bifurcation Chaos 28, 1850080 (2018).
84. M. Kumar, S. Abbas, J.-H. Lee, and J. Kim, Controllable digital resistive switching for artificial synapses and Pavlovian learning algorithm. Nanoscale 11, 15596–15604 (2019).
85. D. A. Lapkin, A. V. Emelyanov, V. A. Demin, T. S. Berzina, and V. V. Erokhin, Spike-timing-dependent plasticity of polyaniline-based memristive element. Microelectron. Eng. 185–186, 43–47 (2018).
86. G. Q. Bi and M. M. Poo, Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type. J. Neurosci. 18, 10464–10472 (1998).
87. M. Prezioso, F. M. Bayat, B. Hoskins, K. Likharev, and D. Strukov, Self-adaptive spike-time-dependent plasticity of metal-oxide memristors. Sci. Rep. 6, 21331 (2016).
88. A. A. Minnekhanov, B. S. Shvetsov, M. M. Martyshov, K. E. Nikiruy, E. V. Kukueva, M. Y. Presnyakov, P. A. Forsh, V. V. Rylkov, V. V. Erokhin, V. A. Demin, and A. V. Emelyanov, On the resistive switching mechanism of parylene-based memristive devices. Org. Electr. 74, 89–95 (2019).
89. A. A. Minnekhanov, A. V. Emelyanov, D. A. Lapkin, K. E. Nikiruy, B. S. Shvetsov, A. A. Nesmelov, V. V. Rylkov, V. A. Demin, and V. V. Erokhin, Parylene based memristive devices with multilevel resistive switching for neuromorphic applications. Sci. Rep. 9, 10800 (2019).
90. I. P. Pavlov, Experimental psychology and psychopathology in animals. In Lectures on Conditioned Reflexes, Vol. 1 (International Publishers, 1928), pp. 47–60.
91. M. Ziegler, R. Soni, T. Patelczyk, M. Ignatov, T. Bartsch, P. Meuffels, and H. Kohlstedt, An electronic version of Pavlov’s dog. Adv. Func. Mater. 22, 2744–2749 (2012).
92. Z. Wang, M. Rao, J.-W. Han, J. Zhang, P. Lin, Y. Li, C. Li, W. Song, S. Asapu, R. Midya, Y. Zhuo, H. Jiang, J. H. Yoon, N. K. Upadhayay, S. Joshi, M. Hu, J. P. Strachan, M. Barnell, Q. Wu, H. Wu, Q. Qiu, R. S. Williams, Q. Xia, and J. J. Yang, Capacitive neural network with neuro-transistors. Nature Commun. 9, 3208 (2018).
93. E. Juzekaeva, A. Nasretdinov, S. Battistoni, T. Berzina, S. Iannotta, R. Khazipov, V. Erokhin, and M. Mukhtarov, Coupling cortical neurons through electronic memristive synapse. Adv. Mater. Technol. 4, 1800350 (2019).
94. P. Gkoupidenis, N. Schaefer, X. Strakosas, J. A. Fairfield, and G. G. Malliaras, Synaptic plasticity functions in an organic electrochemical transistor. Appl. Phys. Lett. 107, 263302 (2015).
95. S. J. Etherington, S. E. Atkinson, G. J. Stuart, and S. R. Williams, Synaptic integration. In eLS (John Wiley and Sons Ltd, New Jersey, 2010).
96. D. Fricker and R. Miles, EPSP amplification and the precision of spike timing in hippocampal neurons. Neuron 28, 559–569 (2000).
97. G. Buzsáki and A. Draguhn, Neuronal oscillations in cortical networks. Science 304, 1926–1929 (2004).
98. V. Erokhin, T. Berzina, K. Gorshkov, P. Camorani, A. Pucci, L. Ricci, G. Ruggeri, R. Sigala, and A. Schüz, Stochastic hybrid 3D matrix: Learning and adaptation of electrical properties. J. Mater. Chem. 22, 22881–22887 (2012).
99. V. Erokhin, A. Schüz, and M. P. Fontana, Organic memristor and bio-inspired information processing. Int. J. Unconvent. Comput. 6, 15–32 (2009).
100. J.-M. Lehn, From supramolecular chemistry towards constitutional dynamic chemistry and adaptive chemistry. Chem. Soc. Rev. 36, 151–160 (2007).
101. V. Erokhin, T. Berzina, P. Camorani, and M. P. Fontana, Conducting polymer – solid electrolyte fibrillar composite material for adaptive networks. Soft Matter 2, 870–874 (2006).
102. Y. N. Malakhova, A. N. Korovin, D. A. Lapkin, S. N. Malakhov, V. S. Shcherban, E. B. Pichkur, S. N. Yakunin, V. A. Demin, S. N. Chvalun, and V. Erokhin, Planar and 3D fibrous polyaniline-based materials for memristive elements. Soft Matter 13, 7300–7306 (2017).
103. T. Berzina, A. Pucci, G. Ruggeri, V. Erokhin, and M. P. Fontana, Gold nanoparticles–polyaniline composite material: Synthesis, structure and electrical properties. Synth. Met. 161, 1408–1413 (2011).
104. T. Berzina, K. Gorshkov, A. Pucci, G. Ruggeri, and V. Erokhin, Langmuir–Schaefer films of a polyaniline–gold nanoparticle composite material for applications in organic memristive devices. RSC Adv. 1, 1537–1541 (2011).
105. V. Erokhin, M. K. Ram, and O. Yavuz, The New Frontiers of Organic and Composite Nanotechnologies (Elsevier, Oxford, Amsterdam, 2008).
106. J. W. Lichtman and H. Colman, Synapse elimination and indelible memory. Neuron 25, 269–278 (2000).
107. G. M. Innocenti and D. O. Frost, The postnatal development of visual callosal connections in the absence of visual experience or of the eyes. Exp. Brain Res. 39, 365–375 (1980).
108. D. Purves and J. W. Lichtman, Elimination of synapses in the developing nervous system. Science 210, 153–157 (1980).
109. R. Apfelbach and E. Weiler, Olfactory deprivation enhances normal spine loss in the olfactory bulb of developing ferrets. Neurosci. Lett. 62, 169–173 (1985).
110. V. Erokhin, On the learning of stochastic networks of organic memristive devices. Int. J. Unconvent. Comput. 9, 303–310 (2013).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0009

Chapter 9

On Wave-Based Majority Gates with Cellular Automata

Genaro J. Martínez, Andrew Adamatzky, Shigeru Ninagawa and Kenichi Morita

Escuela Superior de Cómputo, Instituto Politécnico Nacional, México
Unconventional Computing Lab, University of the West of England, Bristol, UK
Kanazawa Institute of Technology, Kanazawa, Japan
Hiroshima University, Hiroshima, Japan

We demonstrate a discrete implementation of a wave-based majority gate in a chaotic Life-like cellular automaton. The gate functions by controlling the propagation of patterns in stationary channels. The gate presented here is realisable in many living and non-living substrates whose space–time dynamics show wave-like activity or pattern propagation. In the gate, a symmetric pattern represents the binary value 0, while a non-symmetric pattern represents the binary value 1. The origination of the patterns and their symmetry type are encoded by the particle reactions at the beginning of the computation. The patterns propagate in the channels of the gate and compete for space at the intersection of the channels. We implement 3-input majority gates using a W topology, and show additional implementations of 5-input majority gates and a tree (cascade) majority gate.

9.1. Introduction

Recent years have shown a growing interest in physical implementations of majority gates. There are three examples. The first one is a spintronic device, the spin-wave majority gate, which is of low space complexity and consumes ultra-low power. Fischer et al.1 demonstrate a microwave device that can be constructed from majority gates using a trident topology. The device uses the interference of spin waves, which are synchronized patterns of electron spin. The second example is the plasmonic majority gate, in which waves propagate at the interface between a metal and a dielectric. Dutta et al.2 implement plasmonic majority gates in a nanoscale cascadable photonic medium, where the information is encoded in the amplitude and phase of the waves. The third example is given by cellular automata. Cellular automata computations have frequently been implemented by atomic signals, inspired by von Neumann's proof of universality of an automaton with 29 cell-states.3 Computation via reactions between particles (gliders or mobile localizations) was stimulated by the popular Conway's Game of Life, where a lot of complex patterns emerge and interact.4 On the other hand, computation via competing patterns was proposed in Ref. [5]. The Life-like family of rules is a discrete analog of sub-excitable media, and the Game of Life is not the unique rule with complex behavior in this domain, as was shown by Eppstein in Ref. [6]. There is a family of Life-like rules where "cells never die," which means that the state "1" is an absorbing state. One of them is the family of Life without Death (LwD), studied by Griffeath and Moore.7 In the LwD automaton we can observe propagation of patterns, formed due to rule-based restrictions on propagation similar to those in sub-excitable chemical media and the slime mould Physarum polycephalum.8 The LwD family of cell-state transition rules is an automaton equivalent of precipitating chemical systems, as was discussed in a phenomenological study of semi-totalistic and precipitating cellular automata.9 In precipitating cellular automata, channels are constructed and activated by interactions of particles that imitate the propagation of thousands of live organisms across the channels, competing for space. In Refs. [5, 10], we exploited T or X topologies11 to construct adders.12 To explore the
potential of such devices, we drew analogies with spintronic and plasmonic majority-gate devices.13 In the present chapter, we adapt our previous designs to a W topology. Advantages of using majority gates with respect to serial and traditional gates include the reduction of operations and space, an increase in speed, and the ability to perform parallel computation. Step by step, circuits designed with nand and xor gates are being improved with majority gates. For example, quantum dot cellular automata frequently exploit majority gates to design circuits.14–16 Previously we have implemented a half-adder and a binary adder based on majority gates with a W topology.12

This chapter is organized as follows. Section 9.1 introduces the state of the art on the importance of majority gates and of cellular automata propagating information. Section 9.2 gives the base function (cellular automaton), a description of the rule, its global dynamics and statistical characteristics. Section 9.3 explains how the computation using majority gates is implemented by competing patterns. Section 9.4 presents discussions and conclusions.

9.2. Propagation Patterns in Life-like Rules

The Life-like rules domain displays a number of complex functions, some of them without travelling localisations. In Refs. [5, 10] we presented rules with chaotic behavior but with patterns playing the role of "walls" and stopping the "chaotic" universe's expansion. In this study, we focus on the evolution rule B2/S2345, known also as R(2, 5, 2, 2) in the cellular automata literature. The evolution rule B2/S2345 is described as follows. Each cell x ∈ Σ = {0, 1} takes two states, "0" ("dead") and "1" ("alive"), and updates its state depending on its eight closest neighbors V (a sketch of the update rule follows the definition):

(1) Birth: a central cell x_{i,j} in state 0 at time step t takes state 1 at time step t + 1 if it has exactly two neighbors in state 1, i.e., Σ_{i=0}^{|V|−1} x_i = 2.
(2) Survival: a central cell x_{i,j} in state 1 at time t remains in state 1 at time t + 1 if it has two, three, four or five live neighbors, i.e., Σ_{i=0}^{|V|−1} x_i ∈ {2, 3, 4, 5}.
(3) Death: in all other local situations the cell takes state 0.
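A minimal sketch of this update rule (ours, not from the chapter; it assumes NumPy and SciPy and a toroidal lattice):

    import numpy as np
    from scipy.signal import convolve2d

    # Eight-cell Moore neighborhood V; the central cell is excluded.
    KERNEL = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])

    def step(grid):
        """One synchronous update of the Life-like rule B2/S2345."""
        n = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
        birth = (grid == 0) & (n == 2)                # B2
        survival = (grid == 1) & (n >= 2) & (n <= 5)  # S2345
        return (birth | survival).astype(grid.dtype)

Seeding a 2 × 2 block of live cells and iterating step 320 times reproduces the symmetric, forever-expanding evolution of Figure 9.2(a).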


Once a resting lattice is perturbed (a few cells are assigned live states), patterns of state 1 emerge, grow and propagate on the lattice quickly. The rule B2/S2345 is classed as a chaotic function. The global behavior of rule B2/S2345 can be described by a mean field polynomial and its fixed points. Mean field theory is a proven technique for describing statistical properties of cellular automata without analyzing the evolution spaces of individual rules. The method assumes that the elements of the set of states are independent, uncorrelated with each other in the rule’s evolution space. Therefore we can study the probabilities of states in a neighborhood in terms of the probability of a single state (the state into which the neighborhood evolves); thus the probability of a neighborhood is the product of the probabilities of each cell in the neighborhood. McIntosh characterized chaotic cellular automata with the mean field approximation in Ref. [17]: if the density plot crosses the diagonal and has no tangencies, then the function is chaotic. The mean field curve plotted in Figure 9.1(a) displays an unstable fixed point at f(x) = 0.0476, which implies that a small number of active cells will grow quickly, doubling the number of alive cells every two steps (Figure 9.1(b)) and reaching the maximum concentration of alive cells f(x) = 0.4682; this maximum point is very close to the stable fixed point at f(x) = 0.468. It is quite interesting that this plot has similarities with the Game of Life polynomial, where we have positive and negative tangencies related to the complex behavior of the rule. The mean field polynomial for rule B2/S2345 is as follows:

p_{t+1} = 14 p_t^2 q_t^3 (4 p_t^4 + 2 q_t^4 + 5 p_t^3 q_t + 2 p_t q_t^3 + 4 p_t^2 q_t^2),   (9.1)

where p_t is the probability of a cell being in state “1” at time step t and its complement q_t = 1 − p_t is the probability of the cell being in state “0” at time step t.
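The fixed points quoted above can be checked numerically; this small sketch (ours, not from the chapter) scans the unit interval for crossings of the mean field map with the diagonal:

    import numpy as np

    def f(p):
        """Mean field map of B2/S2345, Eq. (9.1)."""
        q = 1 - p
        return 14 * p**2 * q**3 * (4*p**4 + 2*q**4 + 5*p**3*q + 2*p*q**3 + 4*p**2*q**2)

    p = np.linspace(0.0, 1.0, 1_000_001)
    crossings = np.where(np.diff(np.sign(f(p) - p)) != 0)[0]
    print(p[crossings])   # roots near 0, 0.0476 and 0.468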


Let us explore some specific initial conditions in detail. Figure 9.2(a) shows the evolution of the initial configuration consisting of a 2 × 2 block of alive cells (rows 11, 11) after 320 generations, with a population of 104,000 alive cells. The 3D projection is oriented along the x-axis and thus the history of the evolution starts with an inclination of 45°. The base of this pyramid is the last step, where the global behavior shows a periodic pattern that expands symmetrically forever.


(a)

(b)

Figure 9.1. (a) The mean field approximation curve for the rule B2/S2345. The x-axis gives the probability of producing 0s in the next generation and the y-axis the probability of producing 1s in the next generation. The rule has an unstable fixed point at 0.0476 and a stable fixed point at 0.468. (b) The plot displays the density of alive cells in a space of 700 × 700 cells starting from a random initial condition at 4% density; the density curve is fitted by a polynomial of order two.


(a)

(b)

Figure 9.2. Life-like rule B2/S2345 starting from a few cells, in a 3D projection. (a) The evolution of a block of four alive cells; the evolution is periodic and symmetric, expanding forever. The projection is on the x-axis after 320 steps. (b) The L-pentomino initial configuration displays an interesting global behavior where chaos and periodic patterns coexist. The projection is on the y-axis after 300 steps.


Figure 9.2(b) displays the evolution, after 300 generations, of an L-pentomino configuration (rows 1000, 1111) with a population of 95,048 alive cells. This evolution has a 3D orientation on the y-axis. The evolution displays a non-trivial pattern where periodic and chaotic regions emerge and coexist, competing and expanding forever in the space; this behavior originates from the asymmetric formation of cells in the L-pentomino. In this sense, the rule can be classified as a complex rule rather than a chaotic one. At the same time, the evolution shows how some macro-cells emerge during the construction and work as walls, so that a region has no communication with the others. This feature is fundamental for constructing channels of information to process binary data.

From random initial conditions we explore the universe with values close to the unstable fixed point. Figure 9.3 shows the history when the automaton evolution starts from an initial condition with 4% of active cells on a space of 700 × 700 cells. The first snapshot illustrates the first 30 steps, where we can see how a few complex patterns emerge, such as still lifes, oscillators and particle patterns. As in any chaotic function, the initial conditions are sensitive to small perturbations and can explode into chaos. The second snapshot shows the history at 60 steps, where particles travel at the speed of light 1c and nucleations expand quickly and collide with other nucleations. For other densities of initial conditions it is impossible to observe such complex patterns. The third snapshot displays the final state of the global configuration, where a mix of periodic and chaotic regions reaches stability. This pattern is therefore the result of collisions of particles and of nucleations, with a population of 264,578 alive cells after 190 steps.

The Game of Life shows 1/f noise in the evolution starting from a random condition.18 On the other hand, B2/S2345 shows a Lorentzian spectrum (Figure 9.4(a)), in which the power is flat at frequencies lower than some frequency determined by the time constant of the system. That implies that B2/S2345 has a finite relaxation time in its behavior. This is caused by the characteristic behavior of B2/S2345 that the patterns become fixed after the nucleations have finished.


(a)

(b)

(c)

Figure 9.3. Landscapes starting from random initial conditions at 4% density in a space of 700 × 700 cells. (a) After 30 steps we can see how the nucleation of cells explodes while, at the same time, a few mobile and stationary particles travel in the evolution space. (b) After 60 steps the chaos quickly expands along the y-axis. (c) The final global state, where fixed configurations cover the whole evolution space.


(a)

(b)

Figure 9.4. Power spectra calculated from the evolution starting from a random condition (a) at 1% over 1,024 steps and (b) over 128 steps. They are calculated in a square area of 50 × 50 cells among 700 × 700 cells. The x-axis is the frequency f, the y-axis is power. The solid line in (b) is the least-squares fit of the data in the range f = 1–10, following f^(−1.87).

The power spectra, however, differ in areas where the diffusive waves propagate in the channels of the majority gate. The exponent of the power spectrum is about −2, on condition that the observed number of steps is not long, for example, 128 steps. This means that the behavior of each cell is similar to Brownian motion. Power spectra with an exponent close to −2 are also observed in the evolution from a random configuration, provided that the observed number of steps is not long (Figure 9.4(b)). These results suggest that the behavior of B2/S2345 is essentially similar to Brownian motion from the viewpoint of spectral analysis. The most significant difference between B2/S2345 and Life is the existence or non-existence of a "sheath" along the pathway of the signal. While in Life the signals are substantiated by bare propagating patterns that move ahead in vacuous space, in B2/S2345 the pathway of the signal is covered with a sheath constructed from stationary structures. Brownian-motion-like power spectra are observed in another sheath-type CA, Langton's self-reproducing loop, as well.19 The existence or non-existence of a sheath covering the pathway of the signal might make the difference between Brownian motion and 1/f noise.
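The spectral measurement described above can be sketched as follows (our reconstruction under stated assumptions: it reuses the step function from the earlier sketch, a 1% random start, 1,024 observed steps and a 50 × 50 observation window in a 700 × 700 lattice):

    import numpy as np

    T = 1024
    grid = (np.random.random((700, 700)) < 0.01).astype(np.uint8)
    history = np.empty((T, 50, 50))
    for t in range(T):
        history[t] = grid[325:375, 325:375]   # record the observation window
        grid = step(grid)                     # step() from the earlier sketch

    x = history - history.mean(axis=0)        # remove each cell's DC component
    power = (np.abs(np.fft.rfft(x, axis=0))**2).mean(axis=(1, 2))
    # Plotted against frequency on log-log axes, long runs show the Lorentzian
    # shape and short runs (e.g., T = 128) the Brownian-like f^(-2) slope.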


9.3. Majority Gates by Competing Patterns

Majority gates by competing patterns in cellular automata were introduced in Refs. [5, 10, 12] in an X topology. In this section we explore how a W topology works. First, we explore the universe of the Life-like rule B2/S2345. Figure 9.5 shows the very small universe of complex patterns that emerge in this rule. We saw in Figure 9.2 that small perturbations, related to the unstable fixed point of its mean field polynomial, induce chaotic behavior. The four primitive complex patterns are one type of still life, two types of oscillators, and one particle (glider). Table 9.1 shows the basic properties of these complex patterns. In particular, the rule B2/S2345 has an indestructible pattern composed of a concatenation of four still life patterns forming a symmetric still life (Figure 9.6). This pattern is very useful for our constructions because we can design wires inside which patterns propagate. To construct a wire we need to define the minimum size at which a colony of indestructible patterns does not form new active cells yet is conducive to the propagation of particles.

(a)

(b)

(c)

(d)

Figure 9.5. Primitive patterns emerging in Life-like rule B2/S2345. (a) still life, (b) blinker, (c) oscillator, (d) particle.

Table 9.1. Properties of primitive structures in Life-like rule B2/S2345.

Structure     Mass   Volume   Period   Displacement   Speed
Still life    5      3×3      1        0              0
Blinker       2      2×2      2        0              0
Oscillator    3      3×3      2        0              0
Particle      4      3×4      1        1              1c (speed of light)


Figure 9.6. Indestructible pattern in B2/S2345. It is a still life configuration composed of a concatenation of four primitive still life patterns (Figure 9.5(a)).

(a)

(b)

Figure 9.7. (a) Minimal space to construct stable patterns formed by colonies of indestructible still life configurations. (b) The minimal space where a channel can be designed and particles can travel without perturbing the medium.

The particle has a size of four cells, while the indestructible still life has a size of six cells; thus 6 mod 4 = 2 yields the number of combinations in which the particle can be encoded in a wire without exploding into chaos. Figure 9.7(a) shows the minimum space between indestructible patterns that allows a colony of them to be constructed (like an agar configuration) such that they cannot be disturbed by any reaction. A particle must preserve the same distance of two cells to travel freely without producing more information (Figure 9.7(b)). If two particles are travelling continually, they must preserve a distance of two cells to avoid an undesirable reaction. Finally, the wire in which patterns will propagate is defined by a width of three indestructible still life configurations.


The easiest way to control patterns propagating in a nonlinear-medium circuit is to constrain them geometrically. Constraining the medium geometrically is a common technique used when designing computational schemes in spatially extended nonlinear media. For example, "strips" or "channels" are constructed within the medium (e.g., an excitable medium) and connected together. Fronts of propagating phase (excitation) or diffusive waves represent signals, or values of logical variables. When fronts interact at the junctions, some fronts annihilate or new fronts emerge. The propagation in the output channels represents the result of the computation. Boolean values are represented by the position of the particles: positioned initially in the middle of the channel, value 0, or slightly offset, value 1 (Figure 9.8(a)). The initial positions of the particles determine the outcomes of their reaction. A particle corresponding to the value 0 is transformed into a regular symmetric pattern, similar to frozen waves of excitation activity. A particle representing the signal value 1 is transformed into a transversally asymmetric pattern (Figure 9.8(b)). Both patterns propagate inside the channel at constant speed, advancing one unit of channel length per step of discrete time, and the patterns repeat every 16 steps. Minsky describes the majority gate with disjunctive and conjunctive normal logical propositions for three inputs A, B, C in Ref. [20] as follows:

MAJ(A, B, C) = (A ∧ B) ∨ (A ∧ C) ∨ (B ∧ C),   (9.2)

where the result is precisely the most frequent value of the variables.

(a)

(b)

Figure 9.8. Wires constructed with indestructible still life patterns in Life-like rule B2/S2345. (a) The basic structure of an empty channel with a particle initialized to encode a symmetric (top) or asymmetric (bottom) pattern. (b) The symmetric pattern propagating as a wave in the channel represents value 0 (top), and the asymmetric pattern represents value 1 (bottom); both are excitations derived from a particle collision.


Table 9.2. Truth table for a 3-input majority gate.

Input   Output     Input   Output
000     0          100     0
001     0          101     1
010     0          110     1
011     1          111     1

where the result is precisely the most frequent value among the variables. This way, for three inputs there are |Σ|³ = 8 input combinations, as shown in Table 9.2. Conventional and, or and not gates are easy to implement with competing patterns, see Ref. [5]. However, when implementing non-serial logical gates it is more convenient to work with majority gates, as they can be extrapolated more realistically to hot ice computers11 or slime mould,8 and also to molecular computing,21 spintronic technology1,22 or plasmonic devices.2 In particular, these kinds of gates are used to design a diversity of circuits based on majority gates in quantum-dot cellular automata, see Refs. [14–16]. As has been demonstrated, the use of majority gates increases the speed and performance of computation in novel and updated algorithms, see Ref. [23]. A majority gate in B2/S2345 in W topology can be described as follows. The gate has three parallel inputs (West channels) and one output (East channel). Three propagating patterns, which represent the inputs, collide at the cross-junction of the gate. The resultant pattern is recorded at the output channel, as illustrated in Figure 9.9. Figure 9.10 shows the implementation of a 3-input majority gate with W topology by competing patterns using the Life-like cellular automaton B2/S2345. The left side (Figure 9.10(a)) displays the initial condition, where every particle defines the binary value of each input, and the right side (Figure 9.10(b)) displays the result after 178 generations.
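Equation (9.2) is small enough to check exhaustively. The following sketch (our own illustration, not part of the cellular automaton implementation) confirms that the DNF formula coincides with the "at least two inputs are 1" reading of Table 9.2:

    from itertools import product

    def maj3(a, b, c):
        """3-input majority gate, written as the DNF of Eq. (9.2)."""
        return (a and b) or (a and c) or (b and c)

    # Reproduce Table 9.2: the output is 1 exactly when two or more inputs are 1.
    for bits in product((0, 1), repeat=3):
        assert bool(maj3(*bits)) == (sum(bits) >= 2)
        print(bits, maj3(*bits))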


Figure 9.9. A scheme implementing a W architecture for a 3-input majority gate in the Life-like cellular automaton B2/S2345. The length of each input channel is defined internally (with still life patterns) by 72 cells; the central channel, which includes the output, is 138 cells long, and the channel width is defined by 10 cells. On average, the computation is done after 185 generations. The whole volume of this device is defined by an area of 150 × 102 cells.

Figure 9.10. Implementation of a 3-input majority gate by competing patterns.


Figure 9.11. Implementation of a 5-input majority gate by competing patterns in B2/S2345.

One can also implement a 5-input majority gate by competing patterns. This gate can be expressed with disjunctive and conjunctive normal logical propositions for five inputs A, B, C, D, E as follows:

    MAJ(A, B, C, D, E) = (A ∧ B ∧ C) ∨ (A ∧ B ∧ D) ∨ (A ∧ B ∧ E) ∨ (A ∧ C ∧ D) ∨ (A ∧ C ∧ E) ∨ (A ∧ D ∧ E) ∨ (B ∧ C ∧ D) ∨ (B ∧ C ∧ E) ∨ (B ∧ D ∧ E) ∨ (C ∧ D ∧ E).    (9.3)

Figure 9.11 displays the implementation of the 5-input majority gate in B2/S2345. With this number of inputs we have |Σ|⁵ = 32 input combinations.
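A quick combinatorial sketch (again our own illustration) confirms this count and anticipates the symmetry argument made below: grouping the 32 input words by left-right mirroring leaves only 20 distinct cases that actually need to be simulated.

    from itertools import product

    inputs = list(product((0, 1), repeat=5))            # |Σ|^5 = 32 combinations
    mirror_classes = {min(w, w[::-1]) for w in inputs}  # identify each word with its reverse
    print(len(inputs), len(mirror_classes))             # prints: 32 20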


The evolutions are symmetric, and therefore it is not necessary to calculate the whole set of gates: the input MAJ(0, 1, 0, 0, 0) is symmetric, in its codification and representation, to the input MAJ(0, 0, 0, 1, 0), and so on. This 5-input majority gate architecture is implemented in a volume of 150 × 183 cells, and every gate needs five particles to start the reaction and the propagation of patterns. Of course, increasing the number of inputs of a majority gate might lead to problems with the synchronization of collisions. This is somewhat reflected in designs of quantum-dot cellular automata where adders are constructed from 5-input majority gates.24,25 We can also implement a cascaded majority gate by competing patterns, initialized with three 3-input gates, as shown in Figure 9.12.

Figure 9.12. Tree majority gate MAJ(MAJ(1, 0, 1), MAJ(1, 0, 0), MAJ(0, 1, 1)) → 1. The device is implemented in a space of 334 × 326 cells with 9,516 active cells, nine particles and two gate delays. After 457 steps we obtain the final result with a population of 21,629 alive cells.


It is composed of three 3-input majority gates initialized with nine particles. In the process, two gate delays are necessary to synchronize the collision of patterns in the last 3-input majority gate, which gives the final result. This tree of majority gates specifically calculates MAJ(MAJ(1, 0, 1), MAJ(1, 0, 0), MAJ(0, 1, 1)) → 1. These types of gates are implemented in plasmonic devices.2

9.4. Final Notes

Computation by competing patterns is another unconventional way to design computers, handling dozens, hundreds, thousands or millions of organisms to interpret binary values in wires as fragments of wave propagation in a discrete space. Future work will be to implement other non-serial gates by competing patterns and to develop other circuits with 5-input and cascaded majority gates.

References

1. T. Fischer, M. Kewenig, D. Bozhko, A. Serga, I. Syvorotka, F. Ciubotaru, C. Adelmann, B. Hillebrands, and A. Chumak, Experimental prototype of a spin-wave majority gate. Appl. Phys. Lett. 110(15), 152401 (2017).
2. S. Dutta, O. Zografos, S. Gurunarayanan, et al., Proposal for nanoscale cascaded plasmonic majority gates for non-Boolean computation. Sci. Rep. 7, 17866 (2017). https://doi.org/10.1038/s41598-017-17954-2
3. J. von Neumann (edited and completed by A. W. Burks), Theory of Self-reproducing Automata (University of Illinois Press, Urbana, 1966).
4. P. Rendell, Turing Machine Universality of the Game of Life (Springer, 2016).
5. G. J. Martínez, A. Adamatzky, K. Morita, and M. Margenstern, Computation with competing patterns in life-like automaton. In Game of Life Cellular Automata (Springer, 2010), pp. 547–572.
6. D. Eppstein, Growth and decay in life-like cellular automata. In Game of Life Cellular Automata (Springer, 2010), pp. 71–97.
7. D. Griffeath and C. Moore, Life without death is P-complete. Comp. Syst. 10, 437–448 (1996).
8. A. Adamatzky, Physarum Machines: Computers from Slime Mould, vol. 74 (World Scientific, 2010).
9. A. Adamatzky, G. J. Martínez, and J. C. Seck-Tuoh-Mora, Phenomenology of reaction–diffusion binary-state cellular automata. Int. J. Bifurc. Chaos 16(10), 2985–3005 (2006).
10. G. J. Martínez, A. Adamatzky, and B. D. L. Costello, On logical gates in precipitating medium: Cellular automaton model. Phys. Lett. A 372(31), 5115–5119 (2008).


11. A. Adamatzky, Hot ice computer. Phys. Lett. A 374(2), 264–271 (2009).
12. G. J. Martínez, K. Morita, A. Adamatzky, and M. Margenstern, Majority adder implementation by competing patterns in Life-like rule B2/S2345. In Lecture Notes in Computer Science, vol. 6079 (Springer, 2010), pp. 93–104.
13. B. Dieny, I. L. Prejbeanu, K. Garello, P. Gambardella, P. Freitas, R. Lehndorff, W. Raberg, U. Ebels, S. O. Demokritov, J. Akerman et al., Opportunities and challenges for spintronics in the microelectronics industry. Nat. Electron. 3(8), 446–459 (2020).
14. I. Amlani, A. O. Orlov, G. Toth, G. H. Bernstein, C. S. Lent, and G. L. Snider, Digital logic gate using quantum-dot cellular automata. Science 284(5412), 289–291 (1999).
15. K. Navi, S. Sayedsalehi, R. Farazkish, and M. R. Azghadi, Five-input majority gate, a new device for quantum-dot cellular automata. J. Comput. Theoret. Nanosci. 7(8), 1546–1553 (2010).
16. G. Prakash, M. Darbandi, N. Gafar, N. H. Jabarullah, and M. R. Jalali, A new design of 2-bit universal shift register using rotated majority gate based on quantum-dot cellular automata technology. Int. J. Theoret. Phys. 58(9), 3006–3024 (2019).
17. H. V. McIntosh, Wolfram's class IV automata and a good life. Physica D: Nonlinear Phenomena 45(1–3), 105–121 (1990).
18. S. Ninagawa, M. Yoneda, and S. Hirose, 1/f fluctuation in the "Game of Life". Physica D: Nonlinear Phenomena 118(1–2), 49–52 (1998).
19. S. Ninagawa, Dynamics of self-reproducing cellular automata (in preparation).
20. M. Minsky, Computation: Finite and Infinite Machines (Prentice Hall, 1967).
21. J. Gao, Y. Liu, X. Lin, et al., Implementation of cascade logic gates and majority logic gate on a simple and universal molecular platform. Sci. Rep. 7, 14014 (2017). https://doi.org/10.1038/s41598-017-14416-7
22. I. P. Radu et al., Spintronic majority gates. In 2015 IEEE International Electron Devices Meeting (IEDM), Washington, DC, USA, 2015, pp. 32.5.1–32.5.4. doi: 10.1109/IEDM.2015.7409816.
23. V. Pudi, K. Sridharan, and F. Lombardi, Majority logic formulations for parallel adder designs at reduced delay and circuit complexity. IEEE Trans. Comput. 66(10), 1824–1830 (2017).
24. T. N. Sasamal, A. K. Singh, and A. Mohan, An optimal design of full adder based on 5-input majority gate in coplanar quantum-dot cellular automata. Optik 127(20), 8576–8591 (2016).
25. L. Wang and G. Xie, Novel designs of full adder in quantum-dot cellular automata technology. J. Supercomput. 74(9), 4798–4816 (2018).


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0010

Chapter 10

Information Processing in Plants: Hormones as Integrators of External Cues into Plant Development

Mónica L. García-Gómez and George W. Bassel∗

School of Life Sciences, Gibbet Hill Road, University of Warwick, Coventry CV4 7AL, UK
∗[email protected]

Plants remain fixed in the ground, where they are exposed to a constant stream of environmental signals that influence their development. The mechanism that integrates external information into coordinated plant development relies on an encoding and decoding scheme in which hormones play a crucial role. Hormonal levels are regulated in response to external signals, encoding this information into carrier molecules which can be understood and processed by plant cells. Subsequent decoding of this information is mediated by gene regulatory networks and signalling cascades which lead to altered gene regulation. As hormones regulate the levels and sensitivity of each other, this results in a self-regulatory mechanism underpinning plant development. In this chapter we discuss the role of hormones as molecular mediators of external information into plant developmental decisions, the consequences of self-organizing hormonal profiles for plasticity, and the response to future environmental exposures.

10.1. Introduction

Plants, like other organisms, are subject to complex and dynamic environmental conditions that may impose challenges for their survival. While animals have the option to change their location towards a more optimal one, once a seedling has been established, plants remain in that location for the remainder of their lives.


The capacity to respond to the environment is achieved thanks to developmental plasticity, which allows for the modulation of the timing at which key developmental transitions occur. The control of these morphogenetic transitions is heavily dependent on the environmental conditions experienced by the plants. Temperature, humidity and light quality are some of the exogenous signals that simultaneously influence plant development.1 Moreover, these conditions may be constantly changing, implying that a dynamic information processing mechanism is needed to integrate external information into plant development. This information processing mechanism requires that the complex information received is encoded into internal signals that can be understood by plant cells, for instance in the form of hormones, signalling peptides or Ca2+ waves, among others. In principle, the variety of input signals that can be sensed by plants1 surpasses that of the signalling molecules that encode them,2 implying a reduction of dimensionality into a molecular "language" that plants can understand and further manipulate. The complexity of the encoded messages can be further increased by temporal dynamics creating signatures, for example in the case of calcium signalling.3

10.2. Hormonal Encoding of Environmental Information

The encoding of external information is fundamental for the organism to compute its state based on the conditions that are currently being experienced. Key for this is the coordination between different parts of the plant, which may be subject to different local conditions. Hormones are important internal signals used to encode information from the environment, allowing the transfer of information to many other cells that may not be exposed themselves. Indeed, hormones are information carrier molecules that can be locally synthesized in response to an external stimulus and then transported between cells. The intercellular transport of hormones is mediated by specialized membrane transporters,9 by diffusion through plasmodesmata (PD) channels, or via the vascular system, which mediates the long-range transfer of information. This short and


long-range communication allows the transmission of information from cells that are directly exposed to external conditions to those that execute a critical response. This implies a division of labor where some cells are responsible for sensing the external conditions and others for executing a response to withstand them. In this sense, cells act as information processing units, and plant computation takes place in a multicellular context where hormones play a key role in coordinating the decisions of individual cells. Thus, hormone transport bridges molecular- and tissue-scale information processing for a coordinated organ response. An example of this multicellular information processing is the shade avoidance response that modulates the positioning of leaves to ensure light acquisition. This adaptive response involves the local synthesis of auxin in the cells at the tip of the leaf, and the subsequent transmission of this information through PD-mediated transport of auxin to the cells at the petiole that attaches the leaves to the stem. In the petiole, the incoming auxin elicits a downregulation of cytokinin levels and growth,4 resulting in an adjustment of the angle at which leaves are positioned. This information processing relies on changes in gene expression, intercellular communication and cell growth dynamics, which are altogether necessary to respond to the light conditions experienced. The decisions taken in individual cells, namely the synthesis of auxin and the changes in cell growth, take place at opposite ends of the leaf and are coordinated via auxin transport for the benefit of the whole organ. The role of auxin transport in coordinating this division of labor is critical, as disruptions in the long-range transport of auxin result in a failure to adjust the leaf angle to maximize exposure to light.5 A second example of the key role of hormone transport in the coordination of organ responses is root growth homeostasis, where the long-range transport of auxin allows the coordination of the activity of the stem cell niche with the differentiation of its progeny.6 Particularly, the auxin synthesized in the stem cell niche6–8 is transported basipetally by PIN efflux facilitators,9 so that it becomes distributed in the root meristem, reaching the transition zone where it promotes cytokinin signalling and the cessation of cell


division.6 Interestingly, the transition zone has been proposed as a centre for the integration of external inputs into root development,10 in which auxin transport plays a fundamental role. In some instances, there is a need for the immediate transfer of information across the plant for a timely response. For example, the response to soil herbivores is mediated by long-range communication from wounded sites in the root to the leaves.11 This systemic communication is mediated by self-propagating waves of reactive oxygen species (ROS) and Ca2+ production,12 which ultimately impinge on hormonal activity by inducing jasmonate (JA) responses in the leaves.11,13 Given the pressing need for a fast response, the involvement of ROS and Ca2+ in this case could be related to the importance of the rate at which information is transmitted for surviving the attack of a herbivore.12,14 The decoding of hormonal levels is mediated by signalling pathways that can impact the activity of genes that control cell growth, division or differentiation, thus turning hormonal information into complex developmental outputs. The signalling pathways that decode hormonal information consist of receptors that, upon specific binding of a hormone, elicit the activity of associated downstream signalling components, ultimately regulating gene expression. The location, magnitude and nature of the outputs of a hormonal signalling pathway will be determined by the receptor type and abundance (sensitivity) and the particular signalling components present in a given cell. Although it is convenient to think of hormonal signalling pathways as independent from each other, in reality there are multiple points of regulatory crosstalk.15–20 Moreover, several pathways can be simultaneously active in a given cell at any given time. The mechanisms that allow plants to interpret cocktails of hormones rely on the signalling pathways acting in parallel or converging on the regulation of the same targets. Analysis of microarray data from seedlings subjected to various hormonal treatments showed that hormonal signalling pathways preferentially act in parallel, such that each hormone regulates the expression of different target genes.21 This parallel processing can be integrated by the transport of hormones to couple individual cell divisions to the tissue


scale, for a collective benefit of all cells. Although less frequent, in some cases the same genes are regulated by different hormones,21 thus implying a regulatory crosstalk in hormonal responses. Where hormonal responses converge on similar regulatory targets, the integration can be mediated by proteins whose activity can be interpreted as a readout of all of them. For instance, the DELLA proteins were originally identified as negative regulators of gibberellic acid (GA) responses, but have since been shown to be a signalling hub for other hormone signalling pathways, as well as for responses to environmental conditions such as warm temperature or shade.22 For instance, many developmental effects of ethylene, such as the regulation of root growth and the transition to flowering, are mediated through DELLA.23,24 Indeed, DELLA is a convergence point of auxin and ethylene in the regulation of the apical hook of germinating seedlings.23 Another example is jasmonate signalling, which is enhanced by gibberellic acid; this is mediated by a direct interaction between DELLA and JAZ1, a negative regulator.17 On the other hand, JA delays the degradation of DELLA,25 thus establishing a negative feedback on its own signalling. DELLA also interacts directly with BZR1, a key transcription factor that regulates the expression of brassinosteroid-responsive genes, preventing it from binding to its target genes,26,27 and with D14, a strigolactone signalling component.28 Therefore, DELLA plays a critical role in hormonal crosstalk and in the response to multiple signals. The processing of information from many hormones at a given time can be further refined by the spatial separation of hormonal responses. Plant tissues are composed of differentiated cells specialized in computing the responses to a particular hormone. This compartmentalization of hormonal responses allows for a division of labor in the decoding of hormonal profiles. This is the case in the root meristem, where hormonal responses are delimited to particular cell types,29 such that gibberellic acid responses are localized to the endodermis while brassinosteroid responses are confined to the epidermis.29 This spatial division of labor in hormonal responses is further refined by the expression patterns30–32 and affinities33,34 of the signalling components of a particular hormone. This differential expression of


signalling components allows cells to have different responses to the same signal, thus increasing the information processing capabilities of the multicellular system. An example of this is the auxin signalling pathway, including the TIR1 and AUX/IAA co-receptors and the ARF transcription factors, which are non-homogeneously expressed in the root and shoot apical meristems of Arabidopsis thaliana.31,32 As different AUX/IAA proteins have different binding affinities for auxin,33 a specificity in auxin responses can be found in different cell types of the shoot and root meristems. This implies that different cells within an organ may be using different information processing mechanisms due to differences in the molecular constituents they express. This cell-type specificity in response has also been documented for stress responses,35–38 linking the networks underlying cell fate with those responding to external information. This spatial separation of responses allows subsets of cells within an organ to specialize in processing the levels of a particular hormone, or in eliciting a response that is necessary for the whole organism to withstand a given environmental condition. This comes at the price of vulnerability of the information processing mechanism to cell loss or damage, although the outstanding regenerative capacity of plants provides a mechanism to restore the patterns underlying the spatial separation of hormonal responses in these cases. The advent of single-cell sequencing coupled with fluorescence-activated cell sorting39 could allow plant scientists to gain more insight into how common the cell-type compartmentalization of information processing is in plants, and into its role in plant adaptation to challenging growth conditions. Analysing the expression of hormonal signalling components with this approach will reveal differences in responses among the cells of an organ, as well as the extent to which the division-of-labor strategy is found in plants.

10.3. Self-regulatory Hormonal Network Underlying Plasticity in Plant Development

The constant stream of inputs from the environment that plants experience implies that information processing is a continuous


and dynamic process. At any given moment, hormones are being produced and destroyed in response to external signals, and this ever-changing cocktail of internal signals is continuously processed and integrated into the regulatory networks to inform developmental decisions. Part of the responses elicited by hormones is the modulation of the levels of other hormones, thus establishing a self-regulatory hormonal system. Indeed, hormones regulate the expression of genes involved in the metabolism and signalling of other hormones,6,21,25,40,41 such that a particular hormone can quantitatively modulate the sensitivity to, and the levels of, other hormones. This reciprocal regulation of hormonal metabolism and signalling is present in the examples described in the previous section, where auxin regulates cytokinin levels or signalling to control growth or division.4,6 The topology of the regulatory network may not change, but the levels of its constituents can change quantitatively. This implies quantitative differences in the hormonal profile of plants, in how the system will process new external information, and thus in the possible outputs. The existence of a self-regulatory hormonal network for plant information processing establishes a mechanism by which plants can encode information from previous exposures to an environmental signal in the form of quantitative variations in the levels of hormones. For instance, if a plant has experienced optimal conditions that are not sufficient to trigger a morphogenetic transition, this experience will be encoded in the plant as hormonal levels that will instruct the underlying information processing mechanisms. These differences may influence the outcome of future responses by facilitating the responsiveness of the system for a morphogenetic transition (Figure 10.1). The hormonal levels can be modulated gradually to respond favorably to future optimal conditions, thus modulating the plant's capacity to interpret the following external stimulus. The regulation of the sensitivity to and levels of hormones by other hormones could be an underlying regulatory mechanism by which plants continuously modulate their perception of the environment. The hormonal profiles will be constantly changing in response to the environment, in some cases making the system more prone to


Figure 10.1. Encoding and decoding of environmental information by complex regulatory networks and hormonal transport.

certain morphogenetic transitions. The regulation of hormonal levels by other hormones and their interaction with plant genetic developmental programs constitute an information processing mechanism for the integration of environmental information into plant architecture. In contrast to animals, plant development is very plastic and does not follow a predetermined program. Instead, the number of plant organs that will be developed is highly dependent on the previous conditions experienced. In this sense, plant development is a highly dynamic process that is continuously informed by the conditions experienced by the organism. The self-regulation of hormonal levels represents a mechanism by which information about the external world can be encoded in the form of varying hormone levels and sensitivity. This in


turn would influence the organism's future responses to the environment, thus shaping possible plant responses.

10.3.1. Information processing in the transition from dormancy to germination

The transition from dormancy to germination is one of the most important decisions in the life of a plant, as it defines the position where the plant will be established. The environment informs this irreversible decision by controlling the levels of two hormones with antagonistic effects on germination: on the one hand, abscisic acid (ABA) maintains the dormant state, while gibberellic acid (GA) promotes germination.42 These two hormones mutually inhibit each other's abundance by regulating the expression of enzymes involved in their synthesis and degradation.43 A dynamical model has been proposed to describe the metabolic regulation between ABA and GA.44 This circuit exhibits bistability, such that it has a state with high ABA and low GA levels that corresponds to dormancy, and another state with low ABA and high GA levels that corresponds to germination. Altering the sensitivity to GA in the model resulted in changes in the size of the basins of attraction of each state, showcasing a modulation in the propensity to reach germination. There are several regulatory feedbacks that influence the dynamic behaviour of this regulatory circuit, for example the positive and negative regulation of ABA abundance by its own signalling.44 This latter example constitutes an incoherent feedforward loop in which ABA has two opposing effects on the same target, ABA abundance, which may confer the system with interesting dynamical properties, like the capacity to respond to relative rather than absolute changes in signals.45 In the case of the seed, this motif was shown to be related to the generation of variability in the levels of ABA,46 which may be important for a bet-hedging strategy of seeds to ensure robustness to unpredictable environmental conditions. This incoherent feedforward loop in ABA regulation implements a mechanism that explains the production of seeds with varying levels of ABA.46 This implies that from the beginning the seeds have the same developmental program initialized in different hormonal


profiles, which will impact how seeds respond to future environmental signals. Indeed, before germinating, seeds can encounter varying conditions of temperature and light, among others, and that external information can be encoded by the plant through the regulation of ABA and GA levels. From their creation at the end of development, seeds exhibit differences in their hormonal profiles; these differences will be further amplified by the continuous encoding of external information. This continuous tuning of the hormonal profiles attenuates or amplifies hormonal responses, allowing seeds to constantly process external information to make an informed decision on whether to remain dormant or to germinate. An example of the role of the environment in the regulation of the hormonal profiles of seeds is cold temperature, which stimulates germination in Arabidopsis seeds.47 This is mediated by the upregulation of enzymes involved in GA biosynthesis,44 thus impacting the activity of the bistable regulatory circuit discussed above. Experiments and computational simulations have shown that continuous exposure to cold is not as successful in inducing germination as intermittent cold treatments.44 Therefore, the integration of information from previous exposures to cold informs the morphogenetic transition to germinate. This can be understood as the continuous regulation of GA levels by cold, and the effect of this regulation on the ABA/GA circuit. As other external inputs can also change the levels of ABA and GA, the regulatory circuit described constitutes an integrator of environmental information into the developmental transition to germination. The continuous modulation of hormonal levels in seeds allows this system to constantly inform the cells about the environmental conditions being experienced by the plant, such that the decision to germinate is based on the conditions that have been experienced by the seeds. This plasticity in the timing of germination is fundamental for the future seedling to be established in the most favourable setting. While the external conditions at a given moment may not be optimal enough, or may not last long enough, to elicit the transition to germination, the system can still be primed towards dormancy or germination by changing the levels of ABA and GA.
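The qualitative behaviour of such a bistable switch is easy to reproduce with a toy model. The following sketch is our own minimal illustration (mutual inhibition with made-up parameters and a GA-boosting "cold" input), not the published model of Ref. [44]:

    def step(aba, ga, cold=0.0, dt=0.01):
        """One Euler step of a toy ABA-GA toggle: each hormone represses
        the synthesis of the other and decays linearly; a 'cold' input
        boosts GA synthesis (all parameters are illustrative)."""
        d_aba = 3.0 / (1.0 + ga**2) - aba
        d_ga = 3.0 / (1.0 + aba**2) + cold - ga
        return aba + dt * d_aba, ga + dt * d_ga

    # Two initial conditions relax to the two attractors of the circuit:
    # dormancy (high ABA, low GA) and germination (low ABA, high GA).
    for aba, ga in [(2.0, 0.1), (0.1, 2.0)]:
        for _ in range(3000):
            aba, ga = step(aba, ga)
        print(round(aba, 2), round(ga, 2))

    # A sustained cold input tilts the dormant state into germination:
    aba, ga = 2.6, 0.4
    for _ in range(3000):
        aba, ga = step(aba, ga, cold=1.5)
    print(round(aba, 2), round(ga, 2))

In such a model, changing the inhibition strengths or decay rates reshapes the basins of attraction, which is the mechanistic reading of the "priming" described above.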


In this sense, the ABA–GA bistable circuit provides a mechanism to encode information about the suitability of external conditions into the morphogenetic transition from dormancy to germination.

10.4. Concluding Remarks

Plant hormone signalling pathways are integrated in a complex and non-linear regulatory network, such that one particular hormone may impact the output of the others. This self-regulatory hormonal network is a tool that plants use to encode their previous experiences, by altering the levels of individual hormones and signalling components, to inform future developmental responses. This information processing is continuous and occurs in every cell of a plant, allowing plants to constantly monitor their environment and respond appropriately. For this coordination, the encoding of information in the form of internal molecules, such as hormones, that can move short and long distances is fundamental. The elucidation of the mechanisms that underlie the interaction of plants with the environment, and the key role of hormones in this, allows us to understand how plants adapt to various conditions, and could potentially allow for the artificial design of plant responses.

References

1. S. Gilroy and A. Trewavas, Signal processing and transduction in plant cells: The end of the beginning? Nat. Rev. Mol. Cell Biol. 2(4), 307–314 (2001).
2. A. Santner, L. I. A. Calderon-Villalobos, and M. Estelle, Plant hormones are versatile chemical regulators of plant growth. Nat. Chem. Biol. 5(5), 301–307 (2009).
3. J. H. Liu, J. Whalley, and M. R. Knight, Combining modelling and experimental approaches to explain how calcium signatures are decoded by calmodulin-binding transcription activators (CAMTAs) to produce specific gene expression responses. New Phytol. 208(1), 174–187 (2015).
4. C. Yang and L. Lin, Hormonal regulation in shade avoidance. Front. Plant Sci. 8, 1527 (2017).
5. C. Gao et al., Directionality of plasmodesmata-mediated transport in Arabidopsis leaves supports auxin channeling. Curr. Biol. (2020).
6. L. Moubayidin et al., Spatial coordination between stem cell activity and cell differentiation in the root meristem. Develop. Cell 26(4), 405–415 (2013).


7. H. Tian et al., WOX5–IAA17 feedback circuit-mediated cellular auxin response is crucial for the patterning of root stem cell niches in Arabidopsis. Molecular Plant 7(2), 277–289 (2014).
8. S. Gonzali et al., A turanose-insensitive mutant suggests a role for WOX5 in auxin homeostasis in Arabidopsis thaliana. The Plant J. 44(4), 633–645 (2005).
9. I. Blilou et al., The PIN auxin efflux facilitator network controls growth and patterning in Arabidopsis roots. Nature 433(7021), 39–44 (2005).
10. F. Baluška et al., Root apex transition zone: A signalling–response nexus in the root. Trends Plant Sci. 15(7), 402–408 (2010).
11. Q. Shao et al., Two glutamate- and pH-regulated Ca2+ channels are required for systemic wound signaling in Arabidopsis. Sci. Signaling 13, 640 (2020).
12. S. Gilroy et al., ROS, calcium, and electric signals: Key mediators of rapid systemic signaling in plants. Plant Physiology 171(3), 1606–1615 (2016).
13. A. Kumari et al., Arabidopsis H+-ATPase AHA1 controls slow wave potential duration and wound-response jasmonate pathway activation. Proc. Nat. Acad. Sci. 116(40), 20226–20231 (2019).
14. Y. S. Fichman, I. Zandalinas, and R. Mittler, Untangling the ties that bind different systemic signals in plants. Sci. Signaling 13, 640 (2020).
15. S. Gazzarrini and P. McCourt, Cross-talk in plant hormone signalling: What Arabidopsis mutants are telling us. Annals of Botany 91(6), 605–612 (2003).
16. E. J. Chapman and M. Estelle, Cytokinin and auxin intersection in root meristems. Genome Biol. 10(2), 1–5 (2009).
17. X. Hou et al., DELLAs modulate jasmonate signaling via competitive binding to JAZs. Developmental Cell 19(6), 884–894 (2010).
18. S. Depuydt and C. S. Hardtke, Hormone signalling crosstalk in plant growth regulation. Curr. Biol. 21(9), R365–R373 (2011).
19. X. Cheng, C. Ruyter-Spira, and H. Bouwmeester, The interaction between strigolactones and other plant hormones in the regulation of plant development. Front. Plant Sci. 4, 199 (2013).
20. P. McAtee et al., A dynamic interplay between phytohormones is required for fruit development, maturation, and ripening. Front. Plant Sci. 4, 79 (2013).
21. J. L. Nemhauser, F. Hong, and J. Chory, Different plant hormones regulate similar processes through largely nonoverlapping transcriptional responses. Cell 126(3), 467–475 (2006).
22. N. Blanco-Touriñán et al., COP1 destabilizes DELLA proteins in Arabidopsis. Proc. Nat. Acad. Sci. 117(24), 13792–13799 (2020).
23. P. Achard et al., Ethylene regulates Arabidopsis development via the modulation of DELLA protein growth repressor function. The Plant Cell 15(12), 2816–2825 (2003).
24. P. Achard et al., The plant stress hormone ethylene controls floral transition via DELLA-dependent regulation of floral meristem-identity genes. Proc. Nat. Acad. Sci. 104(15), 6484–6489 (2007).
25. D.-L. Yang et al., Plant hormone jasmonate prioritizes defense over growth by interfering with gibberellin signaling cascade. Proc. Nat. Acad. Sci. 109(19), E1192–E1200 (2012).


26. J. Gallego-Bartolomé et al., Molecular mechanism for the interaction between gibberellin and brassinosteroid signaling pathways in Arabidopsis. Proc. Nat. Acad. Sci. 109(33), 13446–13451 (2012).
27. M.-Y. Bai et al., Brassinosteroid, gibberellin and phytochrome impinge on a common transcription module in Arabidopsis. Nat. Cell Biol. 14(8), 810–817 (2012).
28. H. Nakamura et al., Molecular mechanism of strigolactone perception by DWARF14. Nat. Commun. 4(1), 1–10 (2013).
29. S. Ubeda-Tomás, G. T. S. Beemster, and M. J. Bennett, Hormonal regulation of root growth: Integrating local activities into global behaviour. Trends in Plant Sci. 17(6), 326–331 (2012).
30. N. Fàbregas et al., The Brassinosteroid Insensitive1–Like3 signalosome complex regulates Arabidopsis root development. The Plant Cell 25(9), 3377–3388 (2013).
31. E. H. Rademacher et al., A cellular expression map of the Arabidopsis AUXIN RESPONSE FACTOR gene family. The Plant J. 68(4), 597–606 (2011).
32. T. Vernoux et al., The auxin signalling network translates dynamic input into robust patterning at the shoot apex. Molecular Syst. Biol. 7(1), 508 (2011).
33. L. I. A. Calderón Villalobos et al., A combinatorial TIR1/AFB–Aux/IAA co-receptor system for differential sensing of auxin. Nat. Chem. Biol. 8(5), 477–485 (2012).
34. D. R. Boer et al., Structural basis for DNA binding specificity by the auxin-dependent ARF transcription factors. Cell 156(3), 577–589 (2014).
35. M. L. Gifford et al., Cell-specific nitrogen responses mediate developmental plasticity. Proc. Nat. Acad. Sci. 105(2), 803–808 (2008).
36. A. S. Iyer-Pascuzzi et al., Cell identity regulators link development and stress responses in the Arabidopsis root. Developmental Cell 21(4), 770–782 (2011).
37. L. Walker et al., Changes in gene expression in space and time orchestrate environmentally mediated shaping of root architecture. The Plant Cell 29(10), 2393–2412 (2017).
38. C. Rich-Griffin et al., Regulation of cell type-specific immunity networks in Arabidopsis roots. http://www.plantcell.org/content/early/2020/07/22/tpc.20.00154
39. I. Efroni and K. D. Birnbaum, The potential of single-cell profiling in plants. Genome Biol. 17(1), 65 (2016).
40. B. Jones and K. Ljung, Auxin and cytokinin regulate each other's levels via a metabolic feedback loop. Plant Signaling Behav. 6(6), 901–904 (2011).
41. Z. Yu et al., How plant hormones mediate salt stress responses. Trends in Plant Sci. 25(11), 1117–1130 (2020).
42. R. Finkelstein et al., Molecular aspects of seed dormancy. Ann. Rev. Plant Biol. 59 (2008).
43. M. Seo et al., Regulation of hormone metabolism in Arabidopsis seeds: Phytochrome regulation of abscisic acid. Plant J. 48(3), 354–366 (2006).


44. A. T. Topham et al., Temperature variability is integrated by a spatially embedded decision-making center to break dormancy in Arabidopsis seeds. Proc. Nat. Acad. Sci. 114(25), 6629–6634 (2017).
45. Y. Hart and U. Alon, The utility of paradoxical components in biological circuits. Molecular Cell 49(2), 213–221 (2013).
46. I. G. Johnston and G. W. Bassel, Identification of a bet-hedging network motif generating noise in hormone concentrations and germination propensity in Arabidopsis. J. Roy. Soc. Interface 15(141), 20180042 (2018).
47. Y. Yamauchi et al., Activation of gibberellin biosynthesis and response pathways by low temperature during imbibition of Arabidopsis thaliana seeds. Plant Cell 16, 367–378 (2004).


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0011

Chapter 11

Hybrid Computer Approach to Train a Machine Learning System

Mirko Holzer and Bernd Ulmann∗,†

∗Department of Business Informatics, FOM University of Applied Sciences for Economics and Management, Frankfurt, Germany
†Institute of Medical Systems Biology at Ulm University, Germany

This chapter describes a novel approach to training machine learning systems by means of a hybrid computer setup, that is, a digital computer tightly coupled with an analog computer. In this example, a reinforcement learning system is trained to balance an inverted pendulum which is simulated on an analog computer, demonstrating a solution to the major challenge of adequately simulating the environment for reinforcement learning.a

a The analog/hybrid approach to this problem has also been described in Ref. [1, Sec. 6.24/7.4] with a focus on the analog computer part.

11.1. Introduction

The following sections introduce some basic concepts which underpin the remaining parts of this chapter.

11.1.1. A brief introduction to artificial intelligence and machine learning

Machine learning is one of the most exciting technologies of our time. It boosts the range of tasks that computers can perform to a level



which has been extremely difficult, if not impossible, to achieve using conventional algorithms. Thanks to machine learning, computers understand spoken language, offer their services as virtual assistants such as Siri or Alexa, diagnose cancer from magnetic resonance imaging (MRI), drive cars, compose music, paint artistic pictures, and became world champion in the board game Go. The latter feat is even more impressive when one considers that Go is probably one of the most complex games ever devised;b for example, it is much more complex than chess. Many observers in 1997 thought that IBM's Deep Blue, which defeated the then world chess champion Garry Kasparov, was the first proof-of-concept for Artificial Intelligence (AI). However, since chess is a much simpler game than Go, chess-playing computer algorithms can use trial-and-error methods to evaluate the best next move from the set of all possible sensible moves. In contrast, due to the 10³⁶⁰ possible game paths of Go, which is far more than the number of atoms in the universe, it is impossible to use such simple brute-force algorithms to calculate the best next move. This is why popular belief stated that it required human intuition, as well as creative and strategic thinking, to master Go, until Google's AlphaGo beat 18-time world champion Lee Sedol in 2016. How could AlphaGo beat Lee Sedol? Did Google develop human-like Artificial Intelligence with a masterly intuition for Go and the creative and strategic thinking of a world champion? Far from it. Google's AlphaGo relied heavily on a machine learning technique called reinforcement learning, RL for short, which is at the center of the novel hybrid analog/digital computing approach introduced in this chapter. The phrase AI, on the other hand, is nowadays heavily overused and frequently misunderstood. At the time of writing in early 2020,

b We are referring here to games with complete information, whereas games with incomplete information, such as Stratego, are yet to be conquered by machine learning.


there exists no AI in the true sense of the word. Three flavors of AI are commonly identified:

1. Artificial Narrow Intelligence (ANI): Focused on one narrow task such as playing Go, performing face recognition, or deciding the credit rating of a bank's customer. Several flavors of machine learning are used to implement ANI.

2. Artificial General Intelligence (AGI): Computers with AGI would be equally intelligent to humans in every aspect and would be capable of performing the same kind of intellectual tasks that humans perform, with the same level of success. Currently it is not clear if machine learning as we know it today, including Deep Learning, can ever evolve into AGI or if new approaches are required for this.

3. Artificial Super Intelligence (ASI): Sometimes called "humanity's last invention", ASI is often defined as an intellect that is superior to the best human brains in practically every field, including scientific creativity, general wisdom, and social skills. From today's perspective, ASI will stay in the realm of science fiction for many years to come.

Machine learning is at the core of all of today's AI/ANI efforts. Two fundamentally different categories of machine learning can be identified today. The first category, which consists of supervised learning and unsupervised learning, performs its tasks on existing data sets. As described in Ref. [2, p. 85] in the context of supervised learning:

"A dumb algorithm with lots and lots of data beats a clever one with modest amounts of it. (After all, machine learning is all about letting data do the heavy lifting.)"

Reinforcement learning constitutes the second category. It is inspired by behaviorist psychology and does not rely on data. Instead, it utilizes the concept of a software agent that can learn to perform certain actions, which depend on a given state of an environment, in order to maximize some kind of long-term (cumulative) reward. Reinforcement learning is very well suited for all kinds of control


tasks, as it does not need sub-optimal actions to be explicitly corrected. The focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). Here are some typical applications of machine learning arranged by paradigm:

Supervised learning: Classification (image classification, customer retention, diagnostics) and regression (market forecasting, weather forecasting, advertising popularity prediction).

Unsupervised learning: Clustering (recommender systems, customer segmentation, targeted marketing) and dimensionality reduction (structure discovery, compression, feature elicitation).

Reinforcement learning: Robot navigation, real-time decisions, game AI, resource management, optimization problems.

For the sake of completeness, (deep) neural networks need to be mentioned, even though they are not used in this chapter. They are a very versatile family of algorithms that can be used to implement all three of supervised learning, unsupervised learning and reinforcement learning. When reinforcement learning is implemented using deep neural networks, the term deep reinforcement learning is used. The general public often identifies AI or machine learning with neural networks; this is a gross simplification. Indeed, there are plenty of other approaches for implementing the different paradigms of machine learning, as described in Ref. [3].

11.1.2. Analog versus digital computing

Analog and digital computers are two fundamentally different approaches to computation. A traditional digital computer, more precisely a stored-program digital computer, has a fixed internal structure, consisting of a control unit, arithmetic units, etc., which are controlled by an algorithm stored as a program in some kind of memory. This algorithm is then executed in a basically stepwise fashion. An analog computer, in contrast, does not have a memory at all and is not controlled by an algorithm. It consists of a multitude


of computing elements capable of executing basic operations such as addition, multiplication, integration (sic!), etc. These computing elements are then interconnected in a suitable way to form an analogue, a model, of the problem to be solved. So while a digital computer has a fixed internal structure and a variable program controlling its overall operation, an analog computer has a variable structure and no program in the traditional sense at all. The big advantage of the analog computing approach is that all computing elements involved in an actual program are working in full parallelism, with no data dependencies, memory bottlenecks, etc. slowing down the overall operation. Another difference is that values within an analog computer are typically represented as voltages or currents and thus are as continuous as possible in the real world.c Apart from continuous value representation, analog computers feature integration over time intervals as one of their basic functions. When it comes to the solution or simulation of systems described by coupled differential equations, analog computers are much more energy efficient and typically also much faster than digital computers. On the other hand, the generation of arbitrary functions (e.g., functions which cannot be obtained as the solution of some differential equation), complex logic expressions, etc. are hard to implement on an analog computer. So the idea of coupling a digital computer with an analog computer, yielding a hybrid computer, is pretty obvious. The analog computer basically forms a high-performance co-processor for solving problems based on differential equations, while the digital computer supplies initial conditions, coefficients, etc. to the analog computer, reads back values and controls the overall operation of the hybrid system.d

c There exist digital analog computers, which is not the contradiction it might appear to be. They differ from classical analog computers mainly by their use of a binary value representation. Machines of this class are called DDAs, short for Digital Differential Analyzers, and are not covered here.
d More details on analog and hybrid computer programming can be found in Ref. [1].


11.1.3. Balancing an inverse pendulum using reinforcement learning

Reinforcement learning is particularly well suited for an analog/digital hybrid approach because the RL agent needs a simulated or real environment in which it can perform, and analog computers excel in simulating multitudes of scenarios. The inverse pendulum, as shown in Figure 11.3, was chosen for the validation of the approach discussed in this chapter because it is one of the classical "Hello World" examples of reinforcement learning and it can be easily simulated on small analog computers. The hybrid computer setup consists of the analog computer shown in Figure 11.1, which simulates the inverse pendulum, and a digital computer running a reinforcement learning algorithm written in Python.

Figure 11.1. Analog computer setup.
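A minimal sketch of the digital side of this setup is given below. The serial port name and all command strings are assumptions for illustration only (the actual hybrid controller protocol is documented in Ref. [1]), and the policy is a random placeholder where the reinforcement learning agent would go; the episode logic follows the description given below.

    import random
    import serial  # pyserial

    MAX_X, MAX_ANGLE = 0.9, 0.5        # illustrative machine-unit limits
    hc = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)

    def readout(channel):
        """Ask the hybrid controller for one analog value (hypothetical 'g' command)."""
        hc.write(f"g{channel}\n".encode())
        return float(hc.readline().decode())

    def act(state):
        """Placeholder policy; a real agent would learn from states and rewards."""
        return random.choice((0, 1))   # 0 = push cart left, 1 = push cart right

    for episode in range(100):
        hc.write(b"i\n")               # hypothetical command: reset integrators, new episode
        steps = 0
        while True:
            x, phi = readout(0), readout(1)        # cart position and pendulum angle
            if abs(x) > MAX_X or abs(phi) > MAX_ANGLE:
                break                               # pendulum fell or cart left the track
            hc.write(b"r\n" if act((x, phi)) else b"l\n")  # hypothetical push pulses
            steps += 1
        print(f"episode {episode}: balanced for {steps} steps")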


Figure 11.2. Basic structure of the hybrid computer.

Both systems communicate over a serial connection via USB. Figure 11.2 shows the overall setup: the link between the digital computer on the right-hand side and the analog computer is a hybrid controller, which controls all parts of the analog computer, such as the integrators, digital potentiometers for setting coefficients, etc. This hybrid controller receives commands from the attached digital computer and returns values read from selected computing elements of the analog computer. The simulation and the learning algorithm both run in real time.e

Reinforcement learning takes place in episodes. One episode is defined as "balance the pendulum until it falls over or until the cart moves outside the boundaries of the environment". The digital computer asks the analog computer for real-time simulation information such as the cart's x position and the pendulum's angle ϕ. The learning algorithm then decides if the current episode, and therefore the current learning process, can continue or if the episode needs to be ended; ending the episode also resets the simulation running on the analog computer.

e As can be seen in this video: https://youtu.be/jDGLh8YWvNE.

11.2. The Analog Simulation

Simulating an inverted pendulum mounted on a cart with one degree of freedom, as shown in Figure 11.3, on an analog computer is quite straightforward. The pendulum mass m is assumed to be mounted on top of a mass-less pole which in turn is mounted on a pivot on a cart which can be moved along the horizontal axis. The cart's movement is controlled by applying a force F to the left or right side of the cart




for a certain time-interval δt.

Figure 11.3. Configuration of the inverted pendulum.

If the cart could not move, the pendulum would resemble a simple mathematical pendulum described by

    \ddot{\varphi} - \frac{g}{l}\sin(\varphi) = 0,

where \ddot{\varphi} is the second derivative of the pendulum's angle ϕ with respect to time.f Since in this example the cart is non-stationary, the problem is much more complex and another approach has to be taken. In this case, the Lagrangiang

    L = T - V

is used, where T and V represent the total kinetic and potential energy of the overall system.

f In engineering, notations like \dot{\varphi} = d\varphi/dt, \ddot{\varphi} = d^2\varphi/dt^2, etc. are commonly used to denote derivatives with respect to time.
g This approach, which lies at the heart of Lagrangian mechanics, allows the use of so-called generalized coordinates (in this case, the angle ϕ and the cart's position x) instead of the coordinates of classic Newtonian mechanics. More information on Lagrangian mechanics can be found in Ref. [4].


With g representing the gravitational acceleration, the potential energy is

    V = mgl\cos(\varphi),

where l\cos(\varphi) is the height of the pendulum's mass above the cart's upper surface. The kinetic energy is the sum of the kinetic energies of the pendulum bob with mass m and the moving cart with mass M:

    T = \frac{1}{2}\left(M v_c^2 + m v_p^2\right),

where v_c and v_p represent the velocities of the cart and the pendulum, respectively. With x denoting the horizontal position of the cart, the cart's velocity is just the first derivative of its position with respect to time: v_c = \dot{x}. The pendulum's velocity is a bit more convoluted, since the velocity of the pendulum mass has two components, one along the x- and one along the y-axis, forming a two-component vector. The scalar velocity is then the Euclidean norm of this vector, that is,

    v_p = \sqrt{\left(\frac{d}{dt}\left(x - l\sin(\varphi)\right)\right)^2 + \left(\frac{d}{dt}\left(l\cos(\varphi)\right)\right)^2}.

The two components under the square root are

    \left(\frac{d}{dt}\left(x - l\sin(\varphi)\right)\right)^2 = \left(\dot{x} - l\dot{\varphi}\cos(\varphi)\right)^2 = \dot{x}^2 - 2\dot{x}l\dot{\varphi}\cos(\varphi) + l^2\dot{\varphi}^2\cos^2(\varphi)

and

    \left(\frac{d}{dt}\left(l\cos(\varphi)\right)\right)^2 = \left(-l\dot{\varphi}\sin(\varphi)\right)^2 = l^2\dot{\varphi}^2\sin^2(\varphi),

resulting in

    v_p = \sqrt{\dot{x}^2 - 2\dot{x}l\dot{\varphi}\cos(\varphi) + l^2\dot{\varphi}^2\cos^2(\varphi) + l^2\dot{\varphi}^2\sin^2(\varphi)} = \sqrt{\dot{x}^2 - 2\dot{x}\dot{\varphi}l\cos(\varphi) + l^2\dot{\varphi}^2}.


This finally yields the Lagrangian

$$L = \frac{1}{2} M \dot{x}^2 + \frac{1}{2} m \left( \dot{x}^2 - 2 \dot{x} \dot{\varphi} l \cos(\varphi) + l^2 \dot{\varphi}^2 \right) - mgl \cos(\varphi)$$
$$= \frac{1}{2} (M + m) \dot{x}^2 - m \dot{x} \dot{\varphi} l \cos(\varphi) + \frac{1}{2} m l^2 \dot{\varphi}^2 - mgl \cos(\varphi).$$

As a next step, the Euler–Lagrange equations

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = F \quad (11.1)$$

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\varphi}} \right) - \frac{\partial L}{\partial \varphi} = 0 \quad (11.2)$$

are applied. The first of these equations requires the following partial derivatives:

$$\frac{\partial L}{\partial x} = 0,$$
$$\frac{\partial L}{\partial \dot{x}} = (M + m) \dot{x} - m l \dot{\varphi} \cos(\varphi),$$
$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) = (M + m) \ddot{x} - m l \ddot{\varphi} \cos(\varphi) + m l \dot{\varphi}^2 \sin(\varphi),$$

while the second equation relies on these partial derivatives:

$$\frac{\partial L}{\partial \varphi} = m l \dot{x} \dot{\varphi} \sin(\varphi) + m g l \sin(\varphi),$$
$$\frac{\partial L}{\partial \dot{\varphi}} = -m l \dot{x} \cos(\varphi) + m l^2 \dot{\varphi},$$
$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\varphi}} \right) = -m l \ddot{x} \cos(\varphi) + m l \dot{x} \dot{\varphi} \sin(\varphi) + m l^2 \ddot{\varphi}.$$

Substituting these into Equations (11.1) and (11.2) yields the following two Euler–Lagrange equations:

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) = (M + m) \ddot{x} - m l \ddot{\varphi} \cos(\varphi) + m l \dot{\varphi}^2 \sin(\varphi) = F \quad (11.3)$$


and

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\varphi}} \right) - \frac{\partial L}{\partial \varphi} = -m l \ddot{x} \cos(\varphi) + m l \dot{x} \dot{\varphi} \sin(\varphi) + m l^2 \ddot{\varphi} - m l \dot{x} \dot{\varphi} \sin(\varphi) - m g l \sin(\varphi) = 0.$$

Dividing this last equation by ml and solving for $\ddot{\varphi}$ results in

$$\ddot{\varphi} = \frac{1}{l} \left( \ddot{x} \cos(\varphi) + g \sin(\varphi) \right),$$

which can be further simplified to

$$\ddot{\varphi} = \ddot{x} \cos(\varphi) + g \sin(\varphi) \quad (11.4)$$

by assuming a pendulum of fixed length l = 1. The two final equations of motion, (11.3) and (11.4), now fully describe the behaviour of the inverted pendulum mounted on its moving cart, to which an external force F may be applied in order to move the cart and, through its inertia, the pendulum. To simplify things further, it can reasonably be assumed that the mass m of the pendulum bob is negligible compared with the cart's mass M, so that (11.3) can be rewritten as

$$M \ddot{x} = F. \quad (11.5)$$

This simplification comes at a cost: the movement of the pendulum bob no longer influences the cart, as would have been the case with a non-negligible pendulum mass m. Nevertheless, this is not a significant restriction, and it is justified by the resulting simplification of the analog computer program shown in Figure 11.4. As stated before, an analog computer program is basically an interconnection scheme specifying how the various computing elements are to be connected to each other. The schematic makes use of several standard symbols:

• Circles with inscribed values represent coefficients. Technically, these are variable voltage dividers, so a coefficient must always lie in the interval [0, 1]. The input value $\ddot{x}$ (the force applied to the cart in order to move it along the x-axis) is applied to two such voltage dividers, set to the values γ1 and γ2.


Figure 11.4. Modified setup for the inverted pendulum: the analog computer program with coefficients γ1, βϕ, βx and g, integrators producing $-\dot{\varphi}$, −ϕ, $-\dot{x}$ and −x from the input $\ddot{x}$, function generators sin(. . .) and cos(. . .), multipliers, and an auxiliary input h(1 − sin(ωt)).

• Triangles denote summers, which yield the sum of all their inputs at their output. It should be noted that summers perform an implicit sign inversion, so feeding the two values −h and h sin(ωt) to a summer, as shown in the right half of the schematic, yields an output signal of −(h sin(ωt) − h) = h(1 − sin(ωt)).
• Symbols labelled with Π denote multipliers, while those labelled with cos(. . .) and sin(. . .), respectively, are function generators.
• Last but not least, there are integrators, denoted by triangles with a rectangle attached to one side. These computing elements yield the time integral over the sum of their respective input values. Just as with summers, integrators also perform an implicit sign inversion.

Transforming the equations of motion (11.4) and (11.5) into an analog computer program is typically done by means of the Kelvin feedback technique (more information on this can be found in classic textbooks on analog computing or in Ref. [1]).
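For readers without access to an analog computer, the simplified equations of motion can also be integrated digitally as a cross-check. The following minimal sketch is a stand-in, not part of the authors' hybrid setup; the explicit Euler scheme, the parameter values and all names are assumptions made for illustration:

from math import cos, sin

M  = 1.0    # assumed cart mass
g  = 9.81   # gravitational acceleration
dt = 0.001  # assumed integration step size in seconds

def euler_step(state, F):
    # one explicit Euler step of (11.4) and (11.5) with l = 1;
    # state is (x, xdot, phi, phidot), F is the force on the cart
    x, xdot, phi, phidot = state
    xddot = F / M                              # (11.5): M * xddot = F
    phiddot = xddot * cos(phi) + g * sin(phi)  # (11.4)
    return (x + xdot * dt, xdot + xddot * dt,
            phi + phidot * dt, phidot + phiddot * dt)

Unlike the analog computer, such a digital integration depends on the choice of step size and integration scheme, which is exactly the kind of numerical issue the analog setup sidesteps.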

Figure 11.5. Control circuit for the controlled inverted pendulum: the digital outputs D0 and D1 switch between the constant levels +1 and −1 to produce the acceleration $\ddot{x}$; an additional manual switch allows pushing the cart by hand.

The tiny analog computer subprogram shown in Figure 11.5 shows how the force applied to the cart is generated in the hybrid computer setup. The circuit is controlled by two digital output signals from the hybrid controller, the interface between the analog computer and its digital counterpart. Activating the output D1 for a short time interval δt generates a force impulse resulting in an acceleration $\ddot{x}$ of the cart. The output D0 controls the direction of $\ddot{x}$, thus allowing the cart to be pushed to the left or to the right. Both of these control

signals are connected to electronic switches which are part of the analog computer. The outputs of these two switches are then fed to a summer, which also takes a second input signal from a manually operated single pole double throw (SPDT) switch. This switch is normally open and makes it possible for an operator to manually disturb the cart, thereby unbalancing the pendulum. It is interesting to see how the RL system responds to such external disturbances.

11.3. The Reinforcement Learning System

Reinforcement learning, as defined in Ref. [5, Sec. 1.1], "is learning what to do — how to map situations to actions — so as to maximize a numerical reward signal. The learner (agent) is not told which actions to take, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics — trial-and-error search and delayed reward — are the two most important distinguishing features of reinforcement learning."

In other words, reinforcement learning utilizes the concept of an agent that can learn to perform certain actions, depending on a given state of an environment, in order to maximize some kind of long-term (delayed) reward. Figure 11.6 illustrates the interplay of the components of a RL system.


Figure 11.6. Reinforcement learning: the agent performs an action on the environment, and an interpreter derives the resulting state and reward, which are fed back to the agent. (Source: https://en.wikipedia.org/wiki/Reinforcement_learning, retrieved January 9th, 2020.)

In each episode (see Section 11.1.3 for a definition of episode), the agent observes the state s ∈ S of the system: the position x of the cart, the speed $\dot{x}$, the angle ϕ of the pole and the angular velocity $\dot{\varphi}$. Depending on the state, the agent performs an action that modifies the environment. The outcome of the action determines the new state and the short-term reward, which is the main ingredient in finding a value function that is able to predict the long-term reward. Figure 11.7 translates these abstract concepts into the concrete use-case of the inverted pendulum. The short-term reward is just a means to an end for finding the value function: focusing on the short-term reward alone would only lead to an ever-increasing bouncing of the pendulum, which corresponds to the control algorithm diverging. Instead, using the short-term reward to approximate the value function that can predict the long-term reward leads to a robust control algorithm (convergence).

11.3.1. Value function

Roughly speaking, the state value function V estimates "how beneficial" it is to be in a given state. The action value function Q specifies how beneficial it is to take a certain action in a given state.


Figure 11.7. Short-term versus long-term reward for the inverted pendulum: an agent that focuses only on the short-term reward (applying no force, F = 0, so that r_short(s_{i+1}, a_{i+1}) = max) lets the pendulum bounce back and forth with ever increasing angle, while an agent that focuses on the long-term reward accepts sub-maximal short-term rewards r_short(s_i, a_i) by applying small forces F and keeps the pendulum balanced.

Algorithm 1. Q-learning (after Ref. [5, Sec. 6.5]).
 1: Algorithm parameters:
 2:   learning rate α ∈ (0, 1]
 3:   discount rate γ
 4:   small ε > 0
 5: Initialize Q(s, a) ∀ s ∈ S, a ∈ A(s), arbitrarily
 6:   except that Q(terminal state, ·) = 0
 7: Loop for each episode:
 8:   Initialize s
 9:   while s ≠ terminal state do
10:     if random number < ε then
11:       a ← random action ∈ A(s)        ⊳ means: Explore
12:     else
13:       a ← argmax_a(Q(s, a))           ⊳ means: Exploit
14:     Take action a
15:     r ← reward for taking action a while in state s
16:     s′ ← state that follows due to taking action a
17:     a_best ← argmax_a(Q(s′, a))
18:     Q(s, a) ← Q(s, a) + α[r + γ · Q(s′, a_best) − Q(s, a)]
19:     s ← s′

As described in Section 11.3.1, in the context of this chapter the policy π is a greedy policy that always chooses the best possible next action, that is, argmax_a(Q(s, a)). Algorithm 1 is a temporal difference learner based on Bellman's equation and is inspired by Sutton (Ref. [5, Sec. 6.5]). The state space of the inverse pendulum is denoted as $S = \{(x, \dot{x}, \varphi, \dot{\varphi}) \in \mathbb{R}^4\}$, and $A : S \to 2^A$ describes all valid actions for all valid states (it is possible that not all actions that are theoretically possible in an environment are valid in all given states). In plain English, the algorithm can be described as follows. For each step within an episode: decide if the next action shall explore new territory or exploit knowledge that has already been learned. After this decision: take the appropriate action a, which means


that the reward r for this action is collected and the system enters a new state s′ (see Figure 11.6). After that: update the action value function Q(s, a) with an estimated future reward (line 18 of Algorithm 1), but discount the estimated future reward with the learning rate α. In machine learning, the learning rate is a very important parameter in general, because it makes sure that new knowledge obtained in the current episode does not completely "overwrite" past knowledge. The calculation of the estimated future reward is obviously where the magic of Q-learning happens:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \cdot Q(s', a_{best}) - Q(s, a) \right].$$

Q-learning estimates the future reward by taking the short-term reward r for the recently taken action a and adding the discounted delta between the best possible outcome of a hypothetical next action a_best (given the new state s′) and the old status quo Q(s, a). It is worth mentioning that adding the best possible outcome of s′, that is, a_best = argmax_a(Q(s′, a)), can only be a guess, because it is unclear at the time of this calculation whether action a_best will ever be taken, due to the "explore versus exploit" strategy. Still, it seems logical that if Q(s, a) claims to be a measure of "how beneficial" a certain action is, then "the best possible future" a_best from this starting point onwards should be taken into consideration when estimating the future reward. The discount rate γ takes into consideration that the future is not predictable and therefore future rewards cannot be fully counted on. In other words, γ is a parameter that balances the RL agent between the two opposite poles "very greedy and short-sighted" and "decision making based on extremely long-term considerations." As such, γ is one of many so-called hyper parameters; see also Section 11.3.3.4. Given the size of S, it becomes clear that many episodes are needed to train a system using Q-learning. A rigorous proof that Q-learning works and that the algorithm converges is given in Ref. [6].
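To make this update rule concrete, the following minimal tabular sketch combines the ε-greedy selection of lines 10–13 with the update of line 18 for an already discretized state; all names and values are illustrative, and this is not the representation used in this chapter (which is introduced below):

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.6, 0.999, 0.5  # illustrative values
actions = [0, 1]                         # push left / push right
Q = defaultdict(float)                   # Q[(s, a)], zero-initialized

def choose_action(s):
    # explore with probability EPSILON, otherwise exploit
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def q_update(s, a, r, s2):
    # line 18 of Algorithm 1
    best = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best - Q[(s, a)])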

When implementing Q-learning, a major challenge is how to represent the function Q(s, a). For small to medium-sized problems that can be discretized with reasonable effort, tabular representations are appropriate. But as soon as the state space S is large, as in the case of the inverse pendulum, other representations need to be found.

11.3.3. Python implementation

The ideas of this chapter culminate in a hybrid analog/digital machine learning implementation that uses reinforcement learning, more specifically the Q-learning algorithm, in conjunction with linear regression, to solve the challenge of balancing the inverse pendulum. This somewhat arbitrary choice by the authors is not meant to claim that using RL and Q-learning is the only or even an optimal way of balancing the inverse pendulum. On the contrary, it is actually more circuitous than many other approaches. Experiments performed by the authors in Python have shown that a simple random search over a few thousand tries yields a θ = (θ1, θ2, θ3, θ4) such that the dot product with the current state of the simulation $s = (x, \dot{x}, \varphi, \dot{\varphi})$ can reliably decide if the cart should be pushed to the left or to the right, depending on whether or not θ · s is larger than zero. So the intention of this chapter is something else. The authors wanted to show that a general-purpose machine learning algorithm like Q-learning, implemented as described in this chapter, is suitable without change for a plethora of other (and much more complex) real-world applications and that it can be efficiently trained using a hybrid analog/digital setup. Python offers a vibrant Open Source ecosystem of building blocks for machine learning, from which scikit-learn, as described in Ref. [7], was chosen as the foundation of our implementation (full source code on GitHub: https://git.io/Jve3j).

11.3.3.1. States

Each repetition of the While loop in line 9 of Algorithm 1 represents one step within the current episode. The While loop runs until the current state s ∈ S of the inverse pendulum's simulation reaches a terminal state. This is when the simulation of one episode ends,


the simulation is reset and the next episode begins. The terminal state is defined as the cart reaching a certain position |x|, which can be interpreted as "the cart moved too far and leaves the allowed range of movement to the left or to the right or bumps into a wall". Another trigger for the terminal state is the angle |ϕ| of the pole being greater than a certain predefined value, which can be interpreted as "the pole falls over". In Python, the state s ∈ S is represented as a standard Python tuple of floats containing the four elements $(x, \dot{x}, \varphi, \dot{\varphi})$ that constitute the state. During each step of each episode, the analog computer is queried for the current state using the hc_get_sim_state() function (see also Section 11.3.4). On the analog computer, the state is, by the nature of analog computers, a continuous function. As described in Ref. [8, pp. 3–42], an environment in reinforcement learning is typically stated in the form of a Markov decision process (MDP), because Q-learning and many other reinforcement learning algorithms utilize dynamic programming techniques. As an MDP is a discrete-time stochastic control process, this means that for the Python implementation, we need to discretize the continuous state representation of the analog computer. Conveniently, this happens automatically, because the repeated querying inside the While loop mentioned above is nothing else than sampling and therefore a discretization of the continuous (analog) state (due to the partially non-deterministic nature of the Python code execution, the sampling rate is not guaranteed to be constant; the slight jitter introduced by this phenomenon did not notably impede the Q-learning).
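One step of this implicit sampling can be pictured as in the following sketch; is_terminal and choose_action are hypothetical helper names, while hc_get_sim_state and hc_influence_sim are the actual functions shown in Section 11.3.4:

# sketch of one episode of sampling the analog computer
s = hc_get_sim_state()         # (x, xdot, phi, phidot)
while not is_terminal(s):      # |x| or |phi| beyond its limit?
    a = choose_action(s)       # epsilon-greedy over Q(s, a)
    hc_influence_sim(a, True)  # push the cart left or right
    s = hc_get_sim_state()     # repeated querying = sampling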

11.3.3.2. Actions

In theory, one can think of an infinite number of actions that can be performed on the cart on which the inverted pendulum is mounted (for example: push the cart from the left with force 1, force 2, force 3, force 4, . . .). As this would complicate the implementation of the Q-learning algorithm, the following two simplifications were chosen while implementing actions in Python:


• There are only two possible actions a ∈ {0, 1}: "push the cart to the left" and "push the cart to the right".
• Each action a is allowed in all states s ∈ S.

The actions a = 0 and a = 1 are translated to the analog computer simulation by applying a constant force from the right (to push the cart to the left), or the other way round, for a defined constant period of time. The magnitude of the force that is applied is configured as a constant input for the analog computer using a potentiometer, as described in Section 11.2. The constant period of time for which the force is applied can be configured in the Python software module using HC_IMPULSE_DURATION.

11.3.3.3. Modeling the action value function Q(s, a)

A straightforward way of modeling the action value function in Python could be to store Q(s, a) in a Python dictionary, so that the Python equivalent of line 18 in Algorithm 1 would look like this:

dictionary.py

Q_s_a = {}  # create a dictionary to represent Q(s, a)

[...]  # perform the Q-learning algorithm

# learn by updating Q(s, a) using learning rate alpha
Q_s_a[((x, xdot, phi, phidot), a)] = old_q_s_a + alpha * predicted_reward

dictionary.py

There are multiple problems with this approach, and the obviously huge memory requirement for trying to store all s ∈ S in a tabular structure is not even the largest one. An even greater problem is that Python's dictionary semantics are not designed to consider similarities: dictionaries are exact key-value lookups. For a Python dictionary, the two states s1 = (1.0, 1.0, 1.0, 1.0) and s2 = (1.0001, 1.0, 1.0, 1.0) are completely different and not at all similar. In contrast, for a control algorithm that balances a pendulum, a cart at x-position 1.0 is in a very similar situation (state) to a cart at x-position 1.0001. So Python's


dictionaries cannot be used to model Q(s, a). Also, trying to use other tabular methods such as Python lists creates many other challenges. This is why linear regression has been chosen to represent and model the function Q(s, a). As there are only two possible actions a ∈ {0, 1}, no general-purpose implementation of Q(s, a) has been done. Instead, two discrete functions $Q_{a=0}(s)$ and $Q_{a=1}(s)$ are modeled using a linear regression algorithm from scikit-learn called SGDRegressor (SGD stands for Stochastic Gradient Descent), which is capable of performing online learning. In online machine learning, data becomes available step by step and is used to update the best predictor for future data at each step, as opposed to batch learning, where the best predictor is generated by using the entire training data set at once. Given m input features $f_i$ (i = 1, . . . , m) and corresponding coefficients $\theta_i$ (i = 0, . . . , m), linear regression can predict $\hat{y}$ as follows:

$$\hat{y} = \theta_0 + \theta_1 f_1 + \theta_2 f_2 + \cdots + \theta_m f_m.$$

The whole point of SGDRegressor is to iteratively refine the values of all $\theta_i$ as more and more pairs of $[\hat{y}, f_i (i = 1, \ldots, m)]$ are generated during each step of each episode of the Q-learning algorithm, with more and more accuracy due to the policy iteration. Older pairs are likely less accurate estimates and are therefore discounted. A natural choice of features for linear regression would be to set m = 4 and to use the elements of the state s ∈ S as the input features f of the linear regression:

$$Q_{a=n}(s) = \theta_0 + \theta_1 x + \theta_2 \dot{x} + \theta_3 \varphi + \theta_4 \dot{\varphi}.$$

Experiments performed during the implementation have shown that this choice of features does not produce optimal learning results, as it leads to underfitting, that is, the learned model makes too rough predictions for the pendulum to be balanced. Section 11.3.3.4 explains this phenomenon. SGDRegressor offers built-in mechanisms to handle the learning rate α. When constructing the SGDRegressor object, α can be


directly specified as a parameter. Therefore, line 18 of Algorithm 1 is simplified in the Python implementation, as "the α used in the Q-learning algorithm" and "the α used inside the SGDRegressor" are semantically identical: the purpose of both of them is to act as the learning rate. Thus, the α used in the Q-learning algorithm can be omitted (i.e., set to 1):

$$Q_{SGDReg}(s, a) \leftarrow r + \gamma \cdot Q(s', a_{best}).$$

It needs to be mentioned that the SGDRegressor is not just overwriting the old value with the new value when the above-mentioned formula is executed. Instead, when its online learning function partial_fit is called, it improves the existing predictor for Q(s, a) by using the new value $r + \gamma \cdot Q(s', a_{best})$, discounted by the learning rate α. In plain English, calling partial_fit is equivalent to "the linear regressor that is used to represent Q(s, a) in Python updating its knowledge about the action value function Q for the state s in which action a has been taken by a new estimate, without forgetting what has been previously learned. The new estimate that is used to update Q consists of the short-term reward r that came as a result of taking action a while being in state s, plus the discounted estimated long-term reward $\gamma \cdot Q(s', a_{best})$ that is obtained by acting as if, after action a has been taken, always the best possible future action will be taken."

analog-cartpole.py

 1  # List of possible actions that the RL agent can perform in
 2  # the environment. For the algorithm, it doesn't matter if 0
 3  # means right and 1 left or vice versa or if there are more
 4  # than two possible actions
 5  env_actions = [0, 1]
 6
 7  [...]
 8
 9  # we use one Linear Regression per possible action to model
10  # the Value Function for this action, so rbf_net is a list;
11  # SGDRegressor allows step-by-step regression using
12  # partial_fit, which is exactly what we need to learn
13  rbf_net = [SGDRegressor(eta0=ALPHA, power_t=ALPHA_DECAY,
14                          learning_rate='invscaling',
15                          max_iter=5,
16                          tol=float("-inf"))
17             for i in range(len(env_actions))]
18
19  [...]
20
21  # learn Value Function for action a in state s
22  def rl_set_Q_s_a(s, a, val):
23      rbf_net[a].partial_fit(rl_transform_s(s),
24                             rl_transform_val(val))
25
26  [...]
27
28  # Learn the new value for the Value Function
29  new_value = r + GAMMA * max_q_s2a2
30  rl_set_Q_s_a(s, a, new_value)

analog-cartpole.py

The source snippet shows that there is a regular Python list which contains two objects of type SGDRegressor (for i in range(len(env_actions)) leads to two iterations, as env_actions contains two elements). Therefore, rbf_net[n] can be considered as the in-memory representation of $Q_{a=n}(s)$. For increasing the accuracy of Q during the learning process, the function rl_set_Q_s_a(s, a, val), which itself uses SGDRegressor's partial_fit function, is called regularly at each step of each episode. Therefore, lines 29–30 are equivalent to line 18 of Algorithm 1.
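The prediction counterpart of rl_set_Q_s_a is not shown in the snippet. A minimal sketch of how it could look, reusing rbf_net, env_actions and rl_transform_s from above (the name rl_get_Q_s_a is hypothetical), is:

# predict Q(s, a) with the linear regressor for action a
def rl_get_Q_s_a(s, a):
    return rbf_net[a].predict(rl_transform_s(s))[0]

# best predicted value for the successor state s2; this is
# the max_q_s2a2 used in lines 29-30 of the snippet above
max_q_s2a2 = max(rl_get_Q_s_a(s2, a2) for a2 in env_actions)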

11.3.3.4. Feature transformation to avoid underfitting

Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. Intuitively, underfitting occurs when the model or the algorithm does not fit the data well enough, because the complexity of the model is too low. As shown in Section 11.2, the inverse pendulum is a nonlinear function, so one might think that this is the reason that a simple


linear regression, which tries to match the complexity of a nonlinear function using s ∈ S as its feature set, might be prone to underfitting. In general, this is not the case. As shown at the beginning of Section 11.3.3, where a control algorithm is introduced that is based on a randomly found θ and controls the cart via the dot product θ · s, with both vectors consisting of a mere four elements, simple linear functions can absolutely control a complex nonlinear phenomenon. So, linear control functions per se are not the reason for underfitting. Instead, the reason why underfitting occurs in the context of the Q-learning algorithm is that the very concept of reinforcement learning's value function, which is a general-purpose concept for machine learning, creates overhead and complexity. And this complexity needs to be matched by the statistical model, in our case linear regression, that is chosen to represent Q(s, a). The value function is more complex than a mere control policy like the one described at the beginning of Section 11.3.3, because it not only contains the information necessary to control the cart but also the information about the expected future reward. This surplus of information needs to be stored somewhere (i.e., it needs to be fitted by the model of choice). The model complexity of linear regression is equivalent to the number m of input features $f_i$ (i = 1, . . . , m) in the linear regression equation $\hat{y} = \theta_0 + \theta_1 f_1 + \theta_2 f_2 + \cdots + \theta_m f_m$. Finding the optimal threshold of model complexity necessary to avoid underfitting is a hard task that has not been solved in data science in general at the time of writing. Therefore, the model complexity is another one of the many hyper parameters that need to be found and fine-tuned before the actual learning process begins. Finding the right hyper parameters often requires a combination of intuition and trial and error. Consequently, the challenge that had to be solved by the authors was: when the model complexity of linear regression needs to be increased, more features are needed. But how can more than four features be generated, given that s ∈ S only consists of the four features x, $\dot{x}$, ϕ, $\dot{\varphi}$? The solution is to perform a feature transformation, where the number of new features after transforming s ∈ S is significantly


bigger than four:

$$(x, \dot{x}, \varphi, \dot{\varphi}) \to (f_1, \ldots, f_m), \quad m \gg 4.$$

Since the open question here is "how much is significantly?", one of the requirements for a good feature transformation function is flexibility, in the sense that adjusting the hyper parameter "model complexity m" must be as easy as adjusting some constants in the Python code (versus finding a completely new feature transformation function each time m needs to be increased or decreased). It is a best practice of feature engineering that a feature transformation introduces a certain nonlinearity. This can be done using many different options, such as a polynomial over the original features. Radial Basis Functions (RBFs) are another option and have been chosen here as a means of feature transformation that allows increasing the model complexity m of the linear regression. An RBF is a function that maps a vector to a real number by calculating (usually) the Euclidean distance from the function's argument to a previously defined center, which is sometimes also called an exemplar. This distance is then used inside a kernel function to obtain the output value of the RBF. By using a large number of RBF transformations, where each RBF uses a different, randomly chosen center, a high number m of input features $f_i$ can be generated for the linear regression. In other words, a feature map of s is generated by applying m different RBFs with m different random centers on s:

$$s = (x, \dot{x}, \varphi, \dot{\varphi}),$$
$$\text{center}_i = \text{random}(x_i, \dot{x}_i, \varphi_i, \dot{\varphi}_i) \quad \text{for all } i = 1 \ldots m,$$
$$\delta_i = \| s - \text{center}_i \|_2,$$
$$\text{RBF}_i : s \in S \to y_i \in \mathbb{R},$$
$$\text{RBF}_i : e^{-(\beta \delta_i)^2} \to y_i. \quad (11.6)$$


β is a constant for all i = 1, . . . , m that defines the shape of the bell curve described by the Gaussian RBF kernel applied here. So, in summary, the method described here generates m features from the original four features of s where, due to the fact that the Euclidean distance to the centers is used, similar states s yield similar transformation results y. As long as this similarity premise holds, and as long as the feature transformation is not just a linear combination of the original features but adds nonlinearity, it does not actually matter for the purpose of adding more model complexity to the linear regression what kind of feature transformation is applied: RBFs are just one example. Due to the fact that it is beneficial in the context of the analog computer simulation to achieve near real-time performance of the Python software (high Python program execution performance helps to achieve a high sampling rate, as described in Section 11.3.3.1, and therefore a high accuracy), the above-mentioned RBF transformation has not been implemented verbatim in Python. This would have been too slow. Instead, the scikit-learn class RBFSampler has been used, which generates the feature map of an RBF kernel using a Monte Carlo approximation of its Fourier transform (see also https://tinyurl.com/RBFSampler). The experiments have shown that RBFSampler is fast enough and that the approximation is good enough.
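For comparison, a verbatim NumPy implementation of Equation (11.6), that is, the variant that was rejected for performance reasons, could look like the following sketch (the number of centers, the value of β and all names are assumptions):

import numpy as np

m = 2500              # number of random centers ("exemplars")
beta = 1.0            # shape of the Gaussian bell curve
centers = np.random.uniform(-1.0, 1.0, size=(m, 4))

def rbf_features(s):
    # map s = (x, xdot, phi, phidot) to m features y_i per (11.6)
    delta = np.linalg.norm(centers - np.asarray(s), axis=1)
    return np.exp(-(beta * delta) ** 2)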

analog-cartpole.py

 1  # The following four constants are tunable hyperparameters.
 2  # Please note that the source code uses the term GAMMA for
 3  # denoting what we are calling BETA in this chapter: the
 4  # shape of the bell curve.
 5  RBF_EXEMPLARS = 250    # amount of exemplars per "gamma
 6                         # instance" of the RBF network
 7  RBF_GAMMA_COUNT = 10   # amount of "gamma instances", i.e.
 8                         # RBF_EXEMPLARS*RBF_GAMMA_COUNT feat.
 9  RBF_GAMMA_MIN = 0.05   # minimum gamma, linear interpolation
10                         # between min and max
11  RBF_GAMMA_MAX = 4.0    # maximum gamma
12
13  [...]
14
15  # create scaler and fit it to the sampled observation space
16  scaler = StandardScaler()
17  scaler.fit(clbr_res)
18
19  [...]
20
21  # the RBF network is built like this: create as many
22  # RBFSamplers as RBF_GAMMA_COUNT and do so by setting the
23  # "width" parameter GAMMA of the RBFs as a linear
24  # interpolation between RBF_GAMMA_MIN and RBF_GAMMA_MAX
25  gammas = np.linspace(RBF_GAMMA_MIN, RBF_GAMMA_MAX,
26                       RBF_GAMMA_COUNT)
27  models = [RBFSampler(n_components=RBF_EXEMPLARS, gamma=g)
28            for g in gammas]
29
30  # we will put all these RBFSamplers into a FeatureUnion, so
31  # that our Linear Regression can regard them as one single
32  # feature space spanning over all "gammas"
33  transformer_list = []
34  for model in models:
35      # RBFSampler just needs the dimensionality,
36      # not the data itself
37      model.fit([[1.0, 1.0, 1.0, 1.0]])
38      transformer_list.append((str(model), model))
39  # union of all RBF exemplars' output
40  rbfs = FeatureUnion(transformer_list)
41
42  [...]
43
44  # transform the 4 features (cart position, cart velocity,
45  # pole angle, pole velocity at tip) into RBF features
46  def rl_transform_s(s):
47      # during calibration, we do not have a scaler yet
48      if scaler is None:
49          return rbfs.transform(np.array(s).reshape(1, -1))
50      else:
51          return rbfs.transform(
52              scaler.transform(np.array(s).reshape(1, -1)))
53
54  # SGDRegressor expects a vector, so we need to transform
55  # our scalar target value into a vector
56  def rl_transform_val(val):
57      return np.array([val]).ravel()

analog-cartpole.py
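As a usage note, calling the transformation on a single state yields one row of RBF_EXEMPLARS · RBF_GAMMA_COUNT = 2500 features (the state values below are hypothetical):

features = rl_transform_s((0.1, 0.0, 0.02, 0.0))
# features.shape == (1, 2500)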

Section 11.3.3.3 left unexplained the two functions rl_transform_s and rl_transform_val used in lines 23 and 24 of the source code snippet shown there. The latter is just a technical necessity: SGDRegressor expects a vector, so rl_transform_val transforms the scalar value val into the one-element vector (val). The actual feature transformation takes place in rl_transform_s. As described above, the scikit-learn class RBFSampler is used for performance reasons. The model complexity m is defined by two constants in Python, RBF_EXEMPLARS and RBF_GAMMA_COUNT (the source code calls β not BETA but GAMMA in this context; in contrast, γ is used to denote the discount factor in this chapter), and m is the product of both of them. As an enhancement to the formula $\text{RBF}_i : e^{-(\beta \delta_i)^2} \to y_i$ of Equation (11.6), where the shape parameter β of the Gaussian curve is kept constant, the solution here creates a linear interpolation between the values RBF_GAMMA_MIN and RBF_GAMMA_MAX, consisting of RBF_GAMMA_COUNT distinct values for β. The overall result of the m RBF transformations is united in a Python FeatureUnion so that the function rl_transform_s can conveniently call the object rbfs to execute the transformation. The experiments have shown that the learning efficiency is higher when the values coming from the analog computer are processed as they are instead of being scaled first. Therefore, the full source code on GitHub (see Section 11.3.3) contains a boolean switch called PERFORM_CALIBRATION that is set to False in the final version.

August 4, 2021

16:47

Handbook of Unconventional Computing (in 2 Vols.) - 9in x 6in

b4205-v2-ch11

Hybrid Computer Approach to Train a Machine Learning System

333

This means that the if branch shown in lines 48 and 49 is the one performing the feature transformation.

11.3.3.5. Decaying α and ε to improve learning and to avoid overfitting

Our experiments have shown that it makes sense to start with relatively high values for the learning rate α and the explore-versus-exploit probability ε and then to decay them over time. This is the equivalent of trying many different options at the beginning, when there is still very little knowledge available (a high ε means favoring explore over exploit), and learning quickly from those explorations (high α). Later, when knowledge has accumulated, it takes more effort to modify facts that have been learned earlier (low α). SGDRegressor has a built-in function to decay α over time. The ε decay has been implemented manually. The following code snippet shows the actual parameters used in the implementation.

analog-cartpole.py

GAMMA           = 0.999  # discount factor for Q-learning
ALPHA           = 0.6    # initial learning rate
ALPHA_DECAY     = 0.1    # learning rate decay
EPSILON         = 0.5    # randomness for epsilon-greedy
EPSILON_DECAY_t = 0.1    # decay parameter for epsilon
EPSILON_DECAY_m = 10     # ditto

analog-cartpole.py
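The manual ε decay itself is not shown in the snippet; one plausible schedule using the constants above (an assumption, since the actual decay function in the source code may differ) is:

def epsilon_for_episode(episode):
    # start at EPSILON and decay as the episode count grows
    return EPSILON * EPSILON_DECAY_m / (EPSILON_DECAY_m +
                                        EPSILON_DECAY_t * episode)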

Decaying α and ε is not only used to improve the learning itself, but also to avoid overfitting the model. Overfitting is described (https://www.lexico.com/definition/overfitting) as the "production of an analysis which corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably". Another mechanism that was used to avoid overfitting is to make sure that the system is "not learning for too long". The concrete meaning of "too long" is, like so many things in machine learning, another hyper parameter. For being able to persistently access the (near) optimal learning state, the Python implementation


supports a command line parameter that forces the current state of the "brain" to be saved every PROBE episodes, where PROBE is a constant (hyper parameter).

11.3.4. Hybrid interface

The Analog Paradigm Model-1 analog computer (http://analogparadigm.com/products.html) used by the authors incorporates a Hybrid Controller for connecting the Model-1 to digital computers. It uses an RS232-over-USB mechanism that is compatible with most PC operating systems, in the sense that the PC operating system is able to provide a virtual serial port that behaves exactly as if the analog and the digital computer were connected by an RS232 cable instead of a USB cable. This is why, in Python, the communication uses the pySerial library (https://pypi.org/project/pyserial). The following code snippet shows the setup parameters. Note that the Hybrid Controller communicates at 250,000 baud.

analog-cartpole.py

# Hybrid Controller serial setup
HC_PORT    = "/dev/cu.usbserial-DN050L1O"
HC_BAUD    = 250000
HC_BYTE    = 8
HC_PARITY  = serial.PARITY_NONE
HC_STOP    = serial.STOPBITS_ONE
HC_RTSCTS  = False
HC_TIMEOUT = 2

[...]

hc_ser = serial.Serial(port=HC_PORT,
                       baudrate=HC_BAUD,
                       bytesize=HC_BYTE,
                       parity=HC_PARITY,
                       stopbits=HC_STOP,
                       rtscts=HC_RTSCTS,
                       dsrdtr=False,
                       timeout=HC_TIMEOUT)

analog-cartpole.py


Analog computers operate in different modes. In this machine learning implementation, the Python script controls the analog computer's mode of operation according to the needs of the algorithm:

Initial condition marks the beginning of an episode. The pendulum is in an upright position.
Operate is the standard mode of operation, where the analog computer is running the simulation in real time.
Halt means that the simulation is paused; it can be resumed at any time by returning to the Operate mode.

The Hybrid Controller accepts certain mode-change commands via the serial line to put the analog computer into the appropriate mode. Moreover, there are several commands to read data from the various computing elements in the analog computer. All computing elements can be referenced by unique addresses. To influence the calculations of the analog computer, the Hybrid Controller provides analog and digital inputs and outputs. For the purposes of this problem, only two digital outputs are needed to drive the model (see Section 11.2):

Digital Out #0 is used to set the direction from which the force is applied when pushing the cart.
Digital Out #1 makes sure that a force is applied to the cart as long as it is set to 1.

analog-cartpole.py

# Addresses of the environment/simulation data
HC_SIM_X_POS     = "0223"  # address cart x
HC_SIM_X_VEL     = "0222"  # address cart x-velocity
HC_SIM_ANGLE     = "0161"  # address pendulum angle
HC_SIM_ANGLE_VEL = "0160"  # address pend. angular vel.

HC_SIM_DIRECTION_1 = "D0"  # dout: cart direct. = 1
HC_SIM_DIRECTION_0 = "d0"  # dout: cart direct. = 0
HC_SIM_IMPULSE_1   = "D1"  # dout: apply force
HC_SIM_IMPULSE_0   = "d1"  # dout: apply NO force


# Model-1 Hybrid Controller: commands
HC_CMD_RESET       = "x"  # reset hybrid controller
HC_CMD_INIT        = "i"  # initial condition
HC_CMD_OP          = "o"  # start to operate
HC_CMD_HALT        = "h"  # halt/pause
HC_CMD_GETVAL      = "g"  # set address of analog computing
                          # element and return value and ID
HC_CMD_BULK_DEFINE = "G"  # set addresses of multiple elements
                          # to be returned in a bulk transfer
                          # via "f"
HC_CMD_BULK_FETCH  = "f"  # fetch values of all addresses
                          # defined by "G"

[...]

def hc_send(cmd):
    hc_ser.write(cmd.encode("ASCII"))

def hc_receive():
    # HC ends each communication with "\n",
    # so we can conveniently use readline
    return hc_ser.readline().decode("ASCII").split("\n")[0]

analog-cartpole.py
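Ending one episode and starting the next one can then be expressed in terms of these mode commands; the following sketch is hypothetical (the function name and the settling delay are assumptions):

from time import sleep

def hc_reset_sim():
    hc_send(HC_CMD_INIT)  # initial condition: pendulum upright
    sleep(0.05)           # assumed settling time for the integrators
    hc_send(HC_CMD_OP)    # operate: run the simulation in real time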

As described in Section 11.3.3.1, the digital computer samples the analog computer's continuous simulation by repeatedly reading the simulation state variables to generate the discretization needed for Q-learning. The Hybrid Controller supports this by providing a bulk readout function that returns the simulation's overall state with as little overhead as possible. This significantly reduces latency during the learn-and-control loop and thus, as experiments have shown, improves the efficiency and convergence speed of the Q-learning algorithm. The bulk mode is activated by sending a bulk definition command that defines a readout group on the Hybrid Controller. After this has been done, the bulk read is triggered by sending a short (single-


character) fetch command. The following code snippet illustrates the concept and the process.

analog-cartpole.py

# define a readout group in the Hybrid Controller
hc_send(HC_CMD_BULK_DEFINE + HC_SIM_X_POS + ';'
                           + HC_SIM_X_VEL + ';'
                           + HC_SIM_ANGLE + ';'
                           + HC_SIM_ANGLE_VEL + '.')

# when using HC_CMD_GETVAL, HC returns "<value> <id>\n";
# we ignore the id but we expect a well-formed response
def hc_res2float(str):
    f = 0
    try:
        f = float(str.split(" ")[0])
        return f
    except:
        [...]

# query the current state of the simulation, which consists
# of the x-pos and the x-velocity of the cart, the angle
# and angular velocity of the pole/pendulum
def hc_get_sim_state():
    # bulk transfer: ask for all values that constitute the
    # state in a bulk using a single fetch command
    if HC_BULK:
        hc_send(HC_CMD_BULK_FETCH)
        (res_x_pos, res_x_vel,
         res_angle, res_angle_vel) = hc_receive().split(';')
        return (hc_res2float(res_x_pos),
                hc_res2float(res_x_vel),
                hc_res2float(res_angle),
                hc_res2float(res_angle_vel))
    else:
        [...]

analog-cartpole.py

The only way the environment in this example can be influenced by the reinforcement learning agent is via the predefined actions, in this case: "push the cart to the left or push it to the right."


When looking at this situation from an analog computer's viewpoint, this is a continuous operation: "to push" means that a force F, which does not necessarily need to be constant over time, is applied for a certain period of time. As described in Section 11.3.3.2, the implementation used here is heavily simplified: two possible actions, one fixed and constant force F, and a constant period of time during which the force is applied form the two possible impulses. As the magnitude |F| of the force is configured directly at the analog computer, the Python code can focus on triggering the impulse using the two above-mentioned digital outputs.

analog-cartpole.py

# duration [ms] of the impulse that influences the cart
HC_IMPULSE_DURATION = 20

[...]

# influence simulation by using an impulse to
# push the cart to the left or to the right;
# it does not matter if "1" means left or right
# as long as "0" means the opposite of "1"
def hc_influence_sim(a, is_learning):
    [...]

    if (a == 1):
        hc_send(HC_SIM_DIRECTION_1)
    else:
        hc_send(HC_SIM_DIRECTION_0)

    hc_send(HC_SIM_IMPULSE_1)
    sleep(HC_IMPULSE_DURATION / 1000.0)
    hc_send(HC_SIM_IMPULSE_0)

    [...]

analog-cartpole.py

In summary, it can be stated that working with an analog computer like the Model-1 with its Hybrid Controller is quite straightforward. The serial communication can be encapsulated in


some functions or objects, and from that moment on, the only thing that one needs to keep in mind is the completely asynchronous and parallel nature of the setup. No assumptions about specific sequencing properties can be made, and all the typical challenges of asynchronous parallel setups, such as jitter, race conditions, latency, etc., can occur and need to be managed properly.

11.4. Results

Analog computers outperform digital computers with respect to raw computational power as well as power efficiency for certain problems, such as the simulation of dynamic systems that can be readily described by systems of coupled differential equations. This is due to the fact that analog computers are inherently parallel in their operation, as they do not rely on some sort of algorithm being executed in a step-wise fashion. Other advantages are their use of a continuous value representation and the fact that integration is an intrinsic function. Typical analog computer setups tend to be extremely stable and are not prone to problems like numerical instabilities. Although the precision of an analog computer is quite limited compared with a stored-program digital computer employing single or even double precision floating point numbers (a precision analog computer is capable of value representation with about four decimal places), solutions obtained by means of an analog computer will always turn out to be realistic, something that cannot be said of numerical simulations, where the selection of a suitable integration scheme can have a big effect on the results obtained.

In reinforcement learning scenarios, the actual simulation of the environment as shown in Figure 11.6 is one of the hardest and most time-consuming things to do. A simulation implemented on an analog computer behaves much more realistically than one performed on a digital computer: it yields unavoidable measurement errors as would be present in a real-world scenario, it is immune to numerical problems, etc.


A plethora of RL algorithms such as (Deep) Q-learning are available and well understood, as shown, for example, in Ref. [5]. Many Open Source implementations are available for free use. So when it comes to practically applying RL to solve challenges in rapidly advancing fields such as autonomous cars, robot navigation, the coupling of human nervous signals to artificial limbs, and the creation of personalized medical treatments via protein folding, a new project very often starts with the question "how can we simulate the environment for our agent to learn from it?" The authors have shown in this chapter that this challenge can be tackled very successfully using analog computers, so that a next generation of analog computers based on VLSI analog chips could help to overcome this mainstream challenge of reinforcement learning by propelling the speed and accuracy of environment simulation for RL agents to a new level. This would result in faster development cycles for RL-enabled products and could therefore be one of many catalysts in transforming reinforcement learning from a research discipline into an economically viable building block for the truly intelligent devices of the future.

Acknowledgement

The authors would like to thank Dr. Chris Giles and Dr. David Faragó for proofreading and making many invaluable suggestions and corrections which greatly enhanced this chapter.

References

1. B. Ulmann, Analog and Hybrid Computer Programming (De Gruyter, 2020).
2. P. Domingos, A few useful things to know about machine learning, Commun. ACM 55(10), 78–87 (2012). doi:10.1145/2347736.2347755.
3. P. Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (Basic Books, 2015).
4. A. J. Brizard, An Introduction to Lagrangian Mechanics (World Scientific Publishing Company, 2008).
5. R. S. Sutton et al., Reinforcement Learning: An Introduction, 2nd edn. (The MIT Press, 2018).
6. C. Watkins and P. Dayan, Q-learning, Mach. Learn. 8(3–4), 279–292 (1992).
7. F. Pedregosa et al., Scikit-learn: Machine learning in Python, J. Mach. Learn. Res. 12, 2825–2830 (2011).
8. M. Wiering et al., Reinforcement Learning (Springer, 2012).
9. C. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, University of Cambridge, 1989.


© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0012

Chapter 12

On the Optimum Geometry and Training Strategy for Chemical Classifiers that Recognize the Shape of a Sphere

Jerzy Gorecki∗, Konrad Gizynski and Ludomir Zommer

Institute of Physical Chemistry, Polish Academy of Sciences, Kasprzaka 44/52, 01-224 Warsaw, Poland
∗[email protected]

In this chapter, we continue the discussion of database classifiers constructed with networks of interacting chemical oscillators. In our previous work,1,2 we demonstrated that a small, regular network of oscillators can predict whether three random numbers in the range [0, 1] describe a point located inside a sphere inscribed within the unit cube [0, 1] × [0, 1] × [0, 1] with an accuracy exceeding 80%. The parameters of the network were determined using evolutionary optimization. Here we apply the same technique to investigate if the classifier accuracy for this problem can be improved by selecting a specific geometry of interacting oscillators. We also address questions on the optimum size of the training database for evolutionary optimization and on the minimum size of the testing dataset for an objective evaluation of classifier accuracy.

12.1. Introduction

"Information makes the world go round": this sentence characterizes a substantial part of human activity at the beginning of the 21st century.3 The tremendous progress in semiconductor technologies has continuously stimulated interest in information processing strategies and in computing media. When Moore's law was formulated,4 probably nobody expected that it would hold for over five decades. It has been anticipated that this law


finally breaks down, and human civilization will need other computing media to sustain this progress. To answer these needs, studies on alternative information processing technologies (chemical reactions, optical systems, or quantum phenomena) have been pursued for over 30 years.5–8 The research on chemical computing is strongly motivated by the expectation to understand information processing in living organisms, believed to be based on chemical reactions.9 In this chapter, we are concerned with information processing using a chemical medium. The approach presented below has been inspired by the properties of a medium in which the Belousov–Zhabotinsky (BZ) reaction proceeds. The BZ-reaction has probably been the most studied chemical reaction exhibiting complex evolution.10–13 The reaction is the oxidation of an organic substrate by bromine compounds in an acidic environment and in the presence of a catalyst. The BZ-reaction became famous because its oscillations can be easily observed, as the changes in the concentrations of the catalyst in its different oxidation forms are reflected by the color of the medium. If ferroin is used as the catalyst, then the medium is red when the reduced form of the catalyst (Fe(phen)$_3^{2+}$) is dominant. The medium becomes blue for a high concentration of the catalyst in the oxidized form (Fe(phen)$_3^{3+}$). The reaction includes an autocatalytic production of the reaction activator (HBrO$_2$). If the medium is spatially distributed and the diffusion of the activator is allowed, then a region corresponding to a high concentration of the activator can trigger the reaction around it, and a pulse of the activator propagating in space can appear. The interest in the BZ-reaction as a medium for chemical information processing comes from the fact that its properties are similar to those observed for the nerve system.9 Using a spatially distributed medium, one can form channels in which the propagation of excitation pulses is observed. These pulses interact (annihilate) with one another and can change their frequency at non-excitable junctions between channels.14 There are many strategies in which the BZ-medium can be applied for computing. Information can be coded in the concentrations of reagents, in spatial structures, or in the spatio-temporal evolution.15,16 An excitable chemical medium


allows for an easy realization of logic gates.17–21 In such gates, the input and output states are coded in the presence or absence of an excitation at a selected point of the computing medium within a specific time interval. However, unlike semiconductor gates, which can reliably operate for years,22 chemical logic is not that robust. In typical experimental conditions using a nonlinear reaction–diffusion medium,17,18,23 the time of stable operation is measured in hours; thus, the bottom-up design of chemical computers does not seem to be productive. The most effective algorithms are obtained if the chemical information processing medium works in parallel. For example, this happens if reactions at different points are coupled by diffusion. Two classical algorithms of reaction–diffusion computing belong to such a class. One of them is the prairie-fire algorithm, which allows finding the shortest path in a labyrinth using wave propagation in an excitable medium.24,25 The other is the Kuhnert algorithm for image processing with the light-inhibited variant of the oscillatory BZ-reaction.26–28 If the ruthenium complex (Ru(bpy)$_3$) is used as the catalyst, then the BZ-reaction is photosensitive, and illumination with blue light produces Br$^-$ ions that inhibit the reaction.26,29,30 After illumination of such an oscillatory medium, excitations are rapidly damped, and the system reaches a stable, steady state. On the other hand, the oscillatory behavior re-appears immediately after the illumination is switched off.31 The existence of such external control is essential for information processing applications because it allows inputting information into the computing medium.26,28,32 For the analysis presented below, it is sufficient to assume that the controlling factor has an inhibiting effect: the oscillations are terminated when it is applied, and the oscillatory behavior is quickly restored when the illumination is switched off. In this chapter, following the analogy with the photosensitive BZ-reaction, we use the word illumination to describe the factor that controls the oscillators. A number of reports on the information processing potential of networks composed of interacting chemical BZ-oscillators have appeared in the recent decade.33–36 Our experiments have shown that networks of oscillators formed by touching droplets containing water


solution of reagents of an oscillatory BZ-reaction can be stabilized by lipids dissolved in the surrounding oil phase. High uniformity of the droplets that form the structure can be achieved if the droplets are formed in a microfluidic device.37 The touching droplets can communicate via the exchange of the activator. If phospholipids (asolectin) are used, then BZ droplets communicate mainly via the exchange of the reaction activator, which can diffuse through the lipid bilayers33 and transmits excitation. The experiments indicate that during a single oscillation cycle of a typical chemical oscillator we can distinguish three phases: excited, refractory, and responsive.13 Such a distinction is important for the simplified model of interactions between two oscillators coupled by the exchange of the reaction activator. The excited phase denotes the peak of the activator concentration. An excited oscillator is able to spread activator molecules around and to speed up their production in the surrounding medium. In the refractory phase, the concentration of the inhibitor is high, and in this phase the oscillator does not respond to activator transported from neighboring oscillators. In the responsive phase, the concentration of the inhibitor decreases. An oscillator in this phase can get excited by interactions with an oscillator in the excited phase. These properties are reflected by the event-based model that allows for fast simulations of the evolution of large oscillator networks. The event-based model is illustrated in Figure 12.1. The results presented below are based on numerical simulations of interacting chemical oscillators. The simplest mathematical model of the BZ-reaction describes the reaction as an interplay between two reagents: the activator (HBrO$_2$ molecules) and the inhibitor (the oxidized form of the catalyst). Two-variable models like the Oregonator38 or the Rovinsky–Zhabotinsky model39 give a quite realistic description of simple oscillations, excitability, and the simplest spatio-temporal phenomena. However, the numerical complexity of models based on kinetic equations is still substantial, and they are too slow for large-scale simulations based on evolutionary optimization. Here, following Ref. 1, we use the event-based model schematically illustrated in Figure 12.1 to simulate the time evolution of the medium. Following
Figure 12.1. Graphical illustration of the event-based model used to simulate the time evolution of a single oscillator and the interactions between oscillators in the network: (a) The lengths of the cycle phases (excitation, refractory and responsive) and the changes between phases when the illumination is switched on or off. (b) Interactions between the nearest-neighbor oscillators: if an oscillator (A) is in the excitation phase, then, Δt = 1 s later, the excitation phase starts in all neighboring oscillators that are in the responsive phase. Oscillators that are in the refractory state do not change their state (reprinted from Ref. [1]).

Following our previous work,1,2 we assumed that during the oscillation cycle an oscillator can be in one of three phases: the excitation phase lasting 1 s, the refractory phase lasting 10 s, or the responsive phase that is 19 s long. For an isolated oscillator, the excitation phase appears just after the responsive phase ends, and the cycle repeats. Thus, the period of the cycle is 30 s. Oscillations with such a period have been observed in experiments with the BZ medium.31 The separation of the oscillation cycle into refractory, responsive, and excitation phases allows introducing a simple model of the interactions between oscillators. We assumed that if an oscillator is excited, then, 1 time unit later, all its neighbors in the responsive phase switch into the excitation phase. We also assume that immediately after the illumination is switched on, the oscillator state changes into the refractory one. When the illumination is switched off, the excitation phase starts immediately.
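These rules are simple enough to be stated as code. The fragment below is a minimal sketch of our reading of the event-based model (the data layout and names are ours, not the authors' program), with the network advanced in discrete 1 s steps:

    # Phases of the event-based model: EXCITED lasts 1 s, REFRACTORY 10 s,
    # RESPONSIVE 19 s; the full cycle of an isolated oscillator is 30 s.
    EXCITED, REFRACTORY, RESPONSIVE = 0, 1, 2
    DURATION = {EXCITED: 1, REFRACTORY: 10, RESPONSIVE: 19}

    def step(states, clocks, neighbors, illum_now, illum_prev):
        """Advance every oscillator of the network by one 1 s time step."""
        excited = {i for i, s in enumerate(states) if s == EXCITED}
        new_states, new_clocks = list(states), list(clocks)
        for i in range(len(states)):
            if illum_now[i]:
                # illumination inhibits: the oscillator is held refractory
                new_states[i], new_clocks[i] = REFRACTORY, 0
            elif illum_prev[i]:
                # the light has just been switched off: excitation starts
                new_states[i], new_clocks[i] = EXCITED, 0
            elif states[i] == RESPONSIVE and excited.intersection(neighbors[i]):
                # a responsive oscillator gets excited 1 s after a neighbor
                new_states[i], new_clocks[i] = EXCITED, 0
            else:
                new_clocks[i] = clocks[i] + 1
                if new_clocks[i] >= DURATION[states[i]]:
                    # excited -> refractory -> responsive -> excited -> ...
                    new_states[i] = (states[i] + 1) % 3
                    new_clocks[i] = 0
        return new_states, new_clocks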
In the following, we are concerned with the application of a network of interacting chemical oscillators40,41 to database classification. In the considered classification problems, the database has the form of a set of records. Each record is an (n + 1)-tuple (p1, p2, ..., pn, rt), where pi are predictors represented by real or integer numbers and rt is an integer defining the record type. The classification program is supposed to output the correct record type if the predictor values are used as the input. It has been recently demonstrated that many information processing tasks can be performed by networks of interacting chemical oscillators using their specific properties.42,43 The idea of a dataset classifier based on a network of photosensitive chemical oscillators is illustrated in Figure 12.2. The considered networks were formed by two types of oscillators (denoted here by circles): "normal" and "input" ones. It is assumed that each oscillator in the network can be individually inhibited by an external factor. We can use this factor to introduce the input information and to control the evolution of the medium. The illuminations of the input oscillators are related by an affine function to the predictor value of a given record.

Figure 12.2. The idea of a database classifier constructed with a network of chemical oscillators. We assume that touching oscillators interact. The classifier is made of oscillators of two types: "normal" and "input" ones. The classification program, obtained as the result of evolutionary optimization, consists of the locations of the input oscillators and the set of inhibition times applied to all normal oscillators, independent of the input value. An input is provided by applying an inhibiting factor (illumination) to selected input oscillators for a time related by an affine function to the predictor value. The number of excitations appearing at one oscillator, selected as the output one, is the classifier answer. Here and in the following figures, the output oscillator is marked by a thick border. The density of the blue color on normal oscillators increases with the time for which a given oscillator is inhibited.

For normal oscillators, the illuminations are fixed. The normal oscillators are supposed to moderate interactions in the medium and to optimize them for a specific problem. In the figures, the illumination time of normal oscillators is indicated by the density of the blue color. It is also assumed that the output information about the record type can be extracted from the number of excitations (the number of maxima of a specific reagent concentration) observed at a selected set of oscillators within a fixed interval of time. In such an approach, information processing is a transient phenomenon: it does not matter whether or not the system approaches a stationary state over a long time. The classification algorithm, obtained as the result of evolutionary optimization, consists of the locations of the input oscillators and of the output one, and the set of illumination times applied to all normal oscillators regardless of the input value. Our recent results suggest that reasonably accurate database classifiers can be constructed with a small network of coupled chemical oscillators.44,45

The results presented below are a continuation of our previous work,1,2 in which we demonstrated that networks of droplets containing reagents of the BZ reaction can determine with reasonably high accuracy whether three random numbers in the range [0, 1] describe a point located inside the sphere in the cube [0, 1] × [0, 1] × [0, 1] (the SIC problem). Here, we consider a similar classification problem; the difference is in the sphere radius. In the first work,1 we considered the sphere inscribed in the unit cube, with the radius r = 0.5; the volume of such a sphere is 0.524. In the second work,2 and here, we consider the slightly more complex problem of a centrally located sphere with a volume equal to 0.5. Since (4/3)πr^3 = 0.5, the radius of such a sphere is r = (3/(8π))^(1/3) ≈ 0.492. Now, the Shannon information of an unbiased set of points located inside and outside the sphere is 1 bit. Datasets of this classification problem are composed of records in the form of 4-tuples (p1, p2, p3, rt), where pi are the x-, y- and z-coordinates of a point, the value of rt is 1 if the point is inside the sphere ((p1 − 0.5)^2 + (p2 − 0.5)^2 + (p3 − 0.5)^2 < r^2), and rt = 0 if the point is located outside the sphere.
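For illustration, a dataset of this form can be generated in a few lines. The sketch below is ours (the chapter does not list its data-generation code), with names chosen for readability:

    # Generate k records (p1, p2, p3, rt) of the SIC problem with the
    # centrally located sphere of volume 0.5 (radius r ~ 0.492).
    import random
    from math import pi

    R = (3.0 / (8.0 * pi)) ** (1.0 / 3.0)  # sphere of volume 0.5

    def sic_dataset(k, seed=None):
        rng = random.Random(seed)
        records = []
        for _ in range(k):
            p = [rng.random(), rng.random(), rng.random()]
            inside = sum((x - 0.5) ** 2 for x in p) < R ** 2
            records.append((p[0], p[1], p[2], 1 if inside else 0))
        return records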
The SIC problem may seem academic, but it has some advantages. Unlike databases concerned with real problems, we can easily generate a database for the SIC problem with any number of elements. Therefore, we can investigate how the classifier accuracy depends on the number of records included in the process of its optimization. Moreover, using different databases, we can verify whether a classifier optimized with one database retains its accuracy when another database of the same problem is used.

In Ref. [1], we presented results for the sphere-inscribed-in-the-cube problem, considering networks in which the oscillators were located at the nodes of a 2D regular lattice and interacted with their nearest neighbors. We considered different numbers of oscillators, from 4 (= 2 × 2) to 25 (= 5 × 5). The classifier accuracy was estimated by counting the number of correct answers for the datasets used during the classifier optimization. As expected, large classifiers performed better, and their accuracy was an increasing function of the classifier size. The highest measured accuracy was 87.5% for a 4 × 4 classifier trained on a dataset of 200 cases. For training datasets composed of 400 records, the accuracy was an increasing function of the network size, and it changed from 77% of correct answers for the 2 × 2 network to 81% for the 5 × 5 network. Thus, the increase in accuracy with the network size was quite small. In Ref. [2], we considered the SIC problem with the sphere volume equal to 0.5. The nine oscillators of the network were arranged in a hexagonal geometry with interactions between the nearest oscillators. The accuracy of the optimized classifiers was close to 90%. It was a substantial increase compared with networks in a regular geometry.

Here, we continue the study of chemical classifiers for the SIC problem and answer questions that have not been addressed in the previous work. First, there is the question of whether the same dataset should be used for classifier optimization and for testing its accuracy. Having in mind practical applications of chemical classifiers, we expect that after proper training the classifier should correctly treat all cases, not only those that were included in the process of its training, as was done in Ref. [1]. Here, we introduce independent databases of different sizes and estimate what number of records is necessary for an objective estimation of classifier accuracy. We discuss the problem of optimization effectiveness by considering different population sizes in the evolutionary algorithm and different numbers of elements in the
training datasets. We also consider networks of different geometries (cf. Figure 12.7) in order to verify which network geometry can improve the classifier accuracy. The chapter is arranged into four sections. Section 12.2 contains basic information on the genetic algorithm used for classifier optimization. A reader who is not familiar with the subject is advised to read our previous paper,1 where the details of the method are given. In Section 12.3, we present results for oscillator-based classifiers of the SIC problem for different geometries of the network, different optimization parameters, and different sizes of the datasets used in the optimization procedure. In Section 12.4, we discuss the obtained results.

12.2. The Evolutionary Optimization of Chemical Classifiers

It has been demonstrated42–44 that the top-down approach based on evolutionary optimization can be successfully applied to design classifiers based on coupled chemical oscillators. At the beginning, we have to fix the time interval [0, tmax] within which the time evolution of the network is observed and the numbers of excitations on all oscillators are counted. All reported results were obtained for tmax = 100 s. This is an important assumption because it says that information processing is a transient phenomenon. We assume that the output information can be extracted by observing the system within the time interval [0, tmax], and it does not matter whether a network reaches a steady state after a long time. Within the top-down approach, first, we specify the function that should be performed by the considered system. Next, we search for possible factors that can modify the system, increasing its information processing ability. Finally, we combine all these factors and apply them to achieve optimum performance. Here (cf. Figure 12.2), these factors are the locations of the input and the normal oscillators and the illumination intervals for all oscillators. We assume that all oscillators were inhibited at the time t = 0 and that the oscillator #i remained inhibited within the time interval
[0, tillum(i)]. For normal oscillators, the illumination times tillum(i) are fixed and the same for all processed records. If an oscillator #i is considered as the input one for the jth predictor, and the predictor value in the record #k is pj ∈ [0, 1], then, when processing the record #k, this oscillator is inhibited (illuminated) within the time interval [0, tstart + (tend − tstart) ∗ pj], where the values of tstart ≤ tmax and tend ≤ tmax are the same for all predictors. Therefore, when processing the record #k, tillum(i) = tstart + (tend − tstart) ∗ pj. In the following, we assume that tstart and tend are identical for all predictors; the symmetry of the considered problem justifies such an assumption. Therefore, a network classifier is defined by the following parameters: Geom — the network geometry and the interactions between oscillators, tmax, Loc — the locations of the input oscillators, tend, tstart, and tillum(i).
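As a sketch of how these parameters act together (with our own, hypothetical names; step, EXCITED and REFRACTORY refer to the event-based update sketched in the previous section, and net is an assumed container for the parameters listed above), a record can be processed as follows:

    # Illumination schedule of one record: normal oscillators keep their
    # fixed t_illum; the input oscillator of predictor j is illuminated
    # for t_start + (t_end - t_start) * p_j seconds.
    def illumination_times(record, t_illum, input_of, t_start, t_end):
        times = dict(t_illum)            # oscillator index -> seconds
        for osc, j in input_of.items():  # input oscillator -> predictor index
            times[osc] = t_start + (t_end - t_start) * record[j]
        return times

    def count_excitations(record, net, t_max=100):
        """Count the excitations of every oscillator within [0, t_max]."""
        times = illumination_times(record, net.t_illum, net.input_of,
                                   net.t_start, net.t_end)
        n = len(net.neighbors)
        states, clocks = [REFRACTORY] * n, [0] * n  # all inhibited at t = 0
        counts, prev = [0] * n, [True] * n
        for t in range(t_max):
            now = [t < times.get(i, 0) for i in range(n)]
            states, clocks = step(states, clocks, net.neighbors, now, prev)
            for i, s in enumerate(states):
                if s == EXCITED:         # the excited phase lasts one step
                    counts[i] += 1
            prev = now
        return counts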
Of course, it would be naive to believe that a randomly selected network of oscillators performs a correct classification of any database we choose. The network parameters should be optimized according to the problem we are going to solve. To do it, we select a training database of the problem and perform a complex, multi-parameter optimization of the network using an evolutionary algorithm. In the beginning, we selected a training dataset of K records and a population of M classifiers with randomly initialized parameters. Next, the fitness of each classifier was evaluated. In the previous papers,1,2,42,43 we measured the fitness using the mutual information between the set of record types and the set of the numbers of excitations observed on the oscillators. The mutual information between two sets46 is the quantity that describes how much information about an element of one set can be gained if we know the corresponding element of the other. Let us assume that the training dataset is D and consider its classifier C. For each record r (r ∈ D), characterized by the record type rt, we can detect the number of excitations orj observed on the oscillator j of the network. For each oscillator j of the classifier C, we calculate the mutual information between the sets R = {rt, r ∈ D} and Oj = {orj, r ∈ D}. The value I(R, Oj) is:

I(R, Oj) = H(R) + H(Oj) − H(R, Oj),

where H(R) and H(Oj) are the Shannon entropies47 of the set of record types in the training dataset and of the set of excitation numbers observed on the oscillator j of the network, and H(R, Oj) is the joint entropy of both these sets. We select the output oscillator as the one for which I(R, Oj) is maximal. The value I(R, O) = maxj I(R, Oj) defines the fitness of the classifier C.

Alternatively, we can use the classifier accuracy as the measure of its fitness. Let us consider the oscillator #j and a particular number of its excitations o. We can introduce two sets: Oj0 = {r; orj = o ∧ rt = 0, r ∈ D} and Oj1 = {r; orj = o ∧ rt = 1, r ∈ D}. If the set Oj0 has more elements than the set Oj1, then it is more likely that a record corresponding to a point outside the sphere is being processed when o excitations are observed at the oscillator j. If the set Oj1 has more elements than the set Oj0, then the probability that the considered record describes a point inside the sphere is higher. Considering all possible numbers of excitations observed on the oscillator j, we can calculate the accuracy of classification Aj when the oscillator j is used as the output one. The maximum maxj Aj defines the classifier accuracy, and the oscillator j at which it is observed is considered as the output one. The value A = maxj Aj can be regarded as an alternative definition of classifier fitness. Let us notice that, in order to calculate the fitness, we have to study the evolution of the network for all elements of the training dataset. It means that the training dataset should not be too large and, moreover, that we need a fast algorithm to describe the network time evolution, including the possible interactions between oscillators.
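Both fitness measures follow directly from these definitions. The fragment below is a minimal illustration in our own notation (record_types holds the values rt and counts_j the excitation numbers orj collected over the training dataset for one candidate output oscillator j):

    # I(R, O_j) = H(R) + H(O_j) - H(R, O_j), and the accuracy A_j obtained
    # by answering, for every excitation number o, with the majority record
    # type among the training records that produced o excitations.
    from collections import Counter
    from math import log2

    def entropy(values):
        n = len(values)
        return -sum(c / n * log2(c / n) for c in Counter(values).values())

    def mutual_information(record_types, counts_j):
        joint = list(zip(record_types, counts_j))
        return entropy(record_types) + entropy(counts_j) - entropy(joint)

    def accuracy(record_types, counts_j):
        by_count = {}
        for rt, o in zip(record_types, counts_j):
            by_count.setdefault(o, Counter())[rt] += 1
        return sum(max(c.values()) for c in by_count.values()) / len(record_types)

The classifier fitness is then the maximum of the chosen measure over all oscillators j, and the maximizing oscillator becomes the output one.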
Figure 12.3. A schematic illustration of the steps leading to the generation of an Offspring from two Parents. First, the Recombination step is performed: randomly selected points A and B in the structure of Parent 1 mark a parallelogram, which is copied, along with the illumination intervals for the inputs, to the Offspring. The other part of the Offspring comes from Parent 2. Then, during the Mutation of the Offspring, the oscillator types, the initial illumination times, and the function that translates the predictor value into the input oscillator illumination time are subject to mutations. The intensity of the blue color in each oscillator indicates the illumination time of the normal oscillators.

When the definition of the classifier fitness has been decided, we can initiate the evolutionary optimization49,50 of the initially generated population of M classifiers. The next generation of the population of classifiers is obtained as the result of the following operations1: the top 5% of the fittest classifiers were copied to the next generation; the remaining 95% of elements of the next generation were generated by recombination and mutation operations applied to pairs of classifiers randomly selected from the upper 50% of the fittest ones. The operations are illustrated in Figure 12.3. Recombination of two Parents produces a single Offspring by combining random parts of their bodies.48 Next, we applied mutation operations to the Offspring.
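One generation step of this scheme can be sketched as follows (a minimal illustration of ours; recombine and mutate are placeholders for the operations of Figure 12.3, which are described next, and are not reproduced here):

    # One generation: 5% elitism; the remaining 95% are produced by
    # recombination and mutation of pairs drawn from the fittest 50%.
    import random

    def next_generation(population, fitness, recombine, mutate, rng=random):
        ranked = sorted(population, key=fitness, reverse=True)
        m = len(ranked)
        elite = ranked[: max(1, m // 20)]  # top 5% copied unchanged
        parents = ranked[: m // 2]         # mating pool: the fittest 50%
        children = []
        while len(elite) + len(children) < m:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(recombine(p1, p2)))
        return elite + children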
We assumed that an input oscillator can change into a normal oscillator and vice versa. Also, the illumination times of the normal oscillators and the parameters tstart, tend of the function that translates the predictor value into the illumination time of an input oscillator are subject to mutations. The probabilities of mutations are selected such that, on average, a single mutation occurs in each classifier within a single evolution step. Since the fitness changes during evolution, the position of the output is not fixed and can also change from generation to generation. We do not exclude the case in which an input oscillator is used as the output one. As a result of trial and error, the fitness of the best classifier increases with the number of generations. The optimization was repeated until the preassumed final number of generations was reached.

The definitions of classifier fitness given above are not equivalent. The maximum fitness values, H(R) when the mutual information is used and A = 1 when the accuracy is used, both describe a perfect classifier. However, the optimization for I(R, O) and the optimization for A are not equivalent: it can be shown that an increase in I(R, O) can decrease A, and vice versa.44

12.3. Results

Here, we would like to answer questions that have not been addressed in the previous papers. In our first report on the SIC problem,1 we estimated the network accuracy on the basis of results obtained for the training dataset. Test datasets of different sizes, different from the training dataset, were used in Ref. [2] to estimate the classifier accuracy. As can be expected, the accuracy estimated using the training dataset was higher than the values obtained with the test datasets. Here, in order to obtain more precise information on the size of the dataset needed for an accurate estimation of classifier quality, we calculated the mutual information and the accuracy using a few testing datasets with the same number of elements. The comparison between the mutual information and the accuracy obtained for the training dataset and for testing datasets of different sizes is presented in Figure 12.4. We considered a network of seven oscillators arranged in the hexagonal geometry shown in Figure 12.4(a).

Figure 12.4. The comparison between the mutual information and the accuracy obtained for the training dataset and for testing datasets of different sizes. (a) shows the structure of the considered network. The blue numbers identify individual oscillators. The touching oscillators interact. The blue color intensity corresponds to the illumination time of a normal oscillator as a fraction of tmax. The mutual information I(R, Oj) is marked in red in the form of a pie chart, where the sector size is normalized to the entropy of the training dataset. The output oscillator is marked with a wide black border. (b) and (c) illustrate the distributions of excitation numbers observed on the oscillators #3 and #1 for all records from the training dataset. The red and green bars correspond to points located inside and outside the sphere, respectively. (d) and (e) show the mutual information and the accuracy calculated for the training dataset (large black dots), three different testing datasets of different sizes in the range from 2000 to 50,000 records (red, blue, and green dots), and a large test dataset of 500,000 records (the black line).

In Figure 12.4(a), the blue numbers identify individual oscillators, and the input oscillators are marked by "in x", "in y" and "in z". The same identification of oscillators is used in Figures 12.5–12.7. The touching oscillators interact. The illumination time of the central, normal oscillator was tillum(3) = 11.95 s. The values of the parameters describing the relationship between the predictor value and the input illumination time were tstart = 68.44 s and tend = 83.63 s. The structure shown in Figure 12.4(a) was obtained after 700 evolutionary steps of optimization towards the maximum accuracy. The population of M = 1000 classifiers was optimized using a training dataset of K = 1000 elements. The optimization program produced a network structure with an interesting symmetry: the inputs of x (the oscillator #6), of y (the oscillator #1), and of z (the oscillator #2) are symmetrically distributed around the central oscillator, which is the output one. These inputs are separated by the other oscillators (#0, #4 and #5). In Figure 12.4(a), the blue color intensity on the normal oscillators increases with their illumination time. The mutual information I(R, Oj) is marked in red in the form of a pie chart, where the sector size is normalized to the maximal value of the mutual information — the entropy of the training dataset (0.992 bit). The output oscillator is marked with a wide black border. Figure 12.4(b) illustrates the distribution of excitation numbers observed on the central oscillator (#3) for all
records from the training dataset. The red and green bars correspond to points located inside and outside the sphere, respectively. As seen in Figure 12.4(b), for the majority of points located inside the sphere the network produced three excitations on the central oscillator. On the other hand, for the majority of points located outside the sphere, four excitations of the central oscillator were observed. The mutual information between the outputs of the training dataset and the number of excitations is 0.397 (the large black dot in Figure 12.4(d)). The classification rule, namely that three excitations at the central oscillator mean that the point is located inside the sphere and four excitations mean that the point is located outside the sphere, leads to an accuracy of 0.849 (the large black dot in Figure 12.4(e)). Figure 12.4(c) illustrates the distribution of excitation numbers observed on the oscillator #1 for all records from the training dataset. Again, if one excitation was observed, then the probability of a record corresponding to a point inside the sphere was larger than the probability of a record corresponding to a point outside the sphere. On the other hand, if two excitations were observed, then with a high probability we can claim that the processed record corresponded to a point outside the sphere. However, the accuracy of the classifier with the output oscillator #1 is low and equals 0.656.

In order to see if the mutual information and the accuracy are correctly estimated using the training dataset containing only 1000 records, we have calculated both quantities using testing datasets of different sizes in the range from 2000 to 50,000 records. We used three different test datasets for each size, and the results are marked by dots of different colors in Figures 12.4(d) and 12.4(e). Moreover, we calculated the mutual information and the accuracy using a test dataset of 500,000 records. These results are considered as the most objective estimation of the classifier quality. They are shown using a horizontal black line in Figures 12.4(d) and 12.4(e). The values obtained for the test dataset of 500,000 records are slightly lower than those estimated for the training dataset.

Figure 12.5. The progress of the classifier optimization for different sizes of the training dataset, from K = 100 to K = 2000 records. A population of M = 1000 individuals was used. The accuracy was optimized for 40,000 evolution steps. (a) and (b) show the mutual information and the accuracy as functions of the number of evolutionary steps. In both figures, the line color codes the size of the training dataset. The structures of the classifiers obtained for different values of K are shown between (a) and (b).

The results demonstrate that, in the case of the SIC problem, in order to get less than a 2% error for the mutual information and less than a 1% error for the accuracy, one should use a test dataset of 50,000 records or more.

Now let us consider the problem of the size of the training dataset needed for the optimization of a classifier of the SIC problem. In evolutionary optimization, we have to calculate the fitness of each classifier in the population. In order to calculate the fitness, we have to study the classifier answers to all elements of the training dataset. Therefore, assuming a fixed size of the classifier population, the numerical complexity of the optimization depends linearly on the number of records in the training dataset. In the calculations, we considered the classifier composed of seven oscillators in the geometry illustrated in Figure 12.4(a). We considered training datasets with different numbers of records K, from 100 to 2000. In each case, the population of M = 1000 classifiers was optimized for 40,000 evolutionary steps to maximize the accuracy. In both Figures 12.5(a) and 12.5(b), the line color codes the size of the training dataset K. In all cases, the major increase in both accuracy and mutual information is observed within the first 1000 evolution steps. Only a small improvement of the accuracy was observed at the later stages of the optimization. The structures of the optimized classifiers are shown between Figures 12.5(a) and 12.5(b). The normal output oscillator is always in the center. For all considered training datasets except K = 500, we obtained structures symmetric with respect to the x-, y- and z-inputs. The other parameters describing the presented classifiers are as follows:

• for K = 100: tstart = 67.32 s, tend = 86.07 s, tillum(0) = 51.92 s, tillum(3) = 38.15 s, tillum(4) = 38.62 s, tillum(5) = 35.23 s. The accuracy measured using the training dataset was 0.890 and I(R, O) = 0.513.
• for K = 200: tstart = 67.18 s, tend = 84.81 s, tillum(0) = 19.51 s, tillum(3) = 17.86 s, tillum(4) = 43.35 s, tillum(5) = 25.40 s. The accuracy measured using the training dataset was 0.900 and I(R, O) = 0.533.
• for K = 500: tstart = 67.68 s, tend = 84.14 s, tillum(0) = 51.92 s, tillum(3) = 11.94 s. The accuracy measured using the training dataset was 0.882 and I(R, O) = 0.482.
• for K = 1000: tstart = 67.76 s, tend = 83.07 s, tillum(3) = 11.94 s. The accuracy measured using the training dataset was 0.881 and I(R, O) = 0.478.
• for K = 2000: tstart = 68.11 s, tend = 85.09 s, tillum(0) = 52.94 s, tillum(3) = 21.43 s, tillum(4) = 29.33 s, tillum(5) = 14.31 s. The accuracy measured using the training dataset was 0.870 and I(R, O) = 0.445.

The fact that the accuracy was larger for small training datasets can be explained by the observation that it is easier to achieve a high accuracy on a small training dataset. When the accuracy was evaluated using a large testing dataset (50,000 records; as seen in Figure 12.4(e), the evaluation with a test dataset of this size was expected to be objective), we obtained 0.812, 0.863, 0.857, 0.874 and 0.860 for K = 100, 200, 500, 1000 and 2000, respectively. Therefore, as expected, the accuracy of the classifier optimized with the smallest training dataset was the smallest one. On the other hand, the qualities of all the other classifiers are at a similar level. We conclude that a training dataset of K ≥ 200 elements includes a sufficient number of records to reflect the SIC problem correctly.

The numerical complexity of the optimization program depends linearly on the size of the optimized population M. However, it can be expected that the optimization is more efficient if the population is large. We investigated classifier optimization for M in the range from 50 to 500, considering populations of 50, 80, 100, 200 and 500 classifiers. Figure 12.6 illustrates the mutual information (a) and the classifier accuracy (b) as functions of the number of evolution steps for different numbers of elements included in the population of classifiers. In the calculations, we considered the classifier composed of seven oscillators in the geometry illustrated in Figure 12.4(a). In each case, the classifiers were optimized for 20,000 evolutionary steps to maximize the accuracy. The training dataset of K = 1000 records was used.

Figure 12.6. The progress of the classifier optimization for different populations of classifiers, from M = 50 to M = 500. The training dataset of K = 1000 records was used. The accuracy was optimized for 20,000 evolution steps. (a) and (b) show the mutual information and the accuracy as functions of the number of evolutionary steps. In both figures, the line color codes the size of the optimized population M. The structures of the classifiers obtained for different values of M are shown between (a) and (b).

In both Figures 12.6(a) and 12.6(b), the line color codes the size of the optimized population M. As in Figure 12.5, the rapid increase in the accuracy is observed at the beginning of the optimization, here within the first 2000 steps of evolution. The parameters describing the optimized classifiers are as follows:
• for M = 50: tstart = 23.40 s, tend = 66.52 s, tillum(1) = 39.45 s, tillum(2) = 52.82 s, tillum(3) = 5.62 s, tillum(6) = 12.88 s. The accuracy measured using the training dataset was 0.813 and I(R, O) = 0.331.
• for M = 80: tstart = 41.39 s, tend = 58.63 s, tillum(1) = 49.66 s, tillum(3) = 27.71 s, tillum(5) = 46.91 s, tillum(6) = 53.19 s. The accuracy measured using the training dataset was 0.875 and I(R, O) = 0.462.
• for M = 100: tstart = 67.14 s, tend = 83.67 s, tillum(2) = 44.33 s, tillum(3) = 19.30 s, tillum(4) = 12.46 s, tillum(6) = 11.52 s. The accuracy measured using the training dataset was 0.881 and I(R, O) = 0.474.
• for M = 200: tstart = 67.67 s, tend = 85.78 s, tillum(1) = 8.57 s, tillum(2) = 14.07 s, tillum(3) = 20.18 s, tillum(6) = 16.01 s. The accuracy measured using the training dataset was 0.869 and I(R, O) = 0.439.
• for M = 500: tstart = 67.63 s, tend = 85.46 s, tillum(0) = 13.65 s, tillum(3) = 22.84 s, tillum(4) = 19.29 s, tillum(5) = 29.69 s. The accuracy measured using the training dataset was 0.874 and I(R, O) = 0.453.

It can be seen that both the mutual information (0.331) and the classifier accuracy (0.813) obtained for the population of M = 50 classifiers are significantly lower than those obtained for the other populations. For the majority of populations, the optimized structures are symmetrical. Non-symmetrical structures were observed for the populations of 80 and 100 classifiers. Nevertheless, both the accuracy and the mutual information for M = 80 and 100, measured on the training datasets, are similar to those observed for classifiers optimized with populations of M = 200 and 500, and for M = 1000 shown in Figure 12.5. The values of the mutual information and the classifier accuracy decrease if we use a large testing dataset (here of 500,000 elements). For this test dataset we obtain: A = 0.785 and I(R, O) = 0.284 for M = 50, A = 0.857 and I(R, O) = 0.410 for M = 80, A = 0.864 and I(R, O) = 0.434 for M = 100, A = 0.864 and I(R, O) = 0.425 for
M = 200, and A = 0.868 and I(R, O) = 0.437 for M = 500. These results suggest that, within the used optimization procedure, the optimized population should contain M ≥ 100 elements to achieve good accuracy.

Finally, we studied whether the accuracy of a classifier for the SIC problem can be improved by selecting a special geometry of the network. Figure 12.7 presents results for classifiers characterized by different geometries. We optimized the parameters of a small triangle (T4) and a large triangle (T10), of a David's shield structure made of 13 oscillators (DS), and of a regular geometry of 4 × 4 = 16 oscillators (REG). The considered geometries are shown in the middle of the figure. The numbers inside the circles identify individual oscillators. We assume that touching oscillators interact. Almost all considered geometries (excluding the regular one) allow for a symmetrical distribution of the inputs around the anticipated output oscillator located in the center. For T4, the lines indicate interactions between the oscillators located at the triangle corners. The line color in Figures 12.7(a) and 12.7(b) codes the network type. In each case, the population of M = 1000 classifiers was optimized using the training dataset of K = 1000 records for 20,000 evolutionary steps to maximize the mutual information.

Figure 12.7. The progress of the classifier optimization for classifiers characterized by different network geometries. The population of M = 1000 classifiers was considered, and the training dataset of K = 1000 records was used. The population was optimized for the maximum mutual information for 20,000 evolution steps. (a) and (b) show the mutual information and the accuracy as functions of the number of evolutionary steps. In both figures, the line color codes the network geometry. The structures of the classifiers obtained for different geometries are shown between (a) and (b). The numbers inside the circles identify individual oscillators.

The parameters describing the optimized classifiers are as follows:

• for T4: tstart = 67.75 s, tend = 83.63 s, tillum(1) = 11.73 s. The classification rule is: three excitations on the output oscillator represent a point inside the sphere; four excitations correspond to a point outside the sphere. The accuracy measured using the training dataset was 0.884 and I(R, O) = 0.487.
• for T10: tstart = 67.85 s, tend = 82.86 s, tillum(4) = 41.32 s, tillum(6) = 31.72 s, tillum(8) = 23.43 s. The classification rule is: two excitations on the output oscillator represent a point inside the sphere; three excitations correspond to a point outside the sphere. The accuracy measured using the training dataset was 0.889 and I(R, O) = 0.497.
• for DS: tstart = 82.68 s, tend = 67.93 s, tillum(4) = 31.13 s, tillum(6) = 11.44 s, tillum(9) = 46.60 s, tillum(11) = 45.23 s. The classification rule is: three excitations on the output oscillator represent a point inside the sphere; four or five excitations correspond to a point outside the sphere. The accuracy measured using the training dataset was 0.891 and I(R, O) = 0.504.
• for REG: tstart = 83.34 s, tend = 67.10 s, tillum(0) = 41.38 s, tillum(1) = 54.76 s, tillum(3) = 8.80 s, tillum(4) = 26.31 s, tillum(9) = 42.07 s, tillum(11) = 57.47 s, tillum(12) = 28.50 s, tillum(13) = 11.62 s, tillum(14) = 16.08 s, tillum(15) = 48.18 s. The classification rule is: four excitations on the output oscillator represent a point inside the sphere; five or six excitations correspond to a point outside the sphere. The accuracy measured using the training dataset was 0.903 and I(R, O) = 0.554.

For almost all considered geometries (except REG), the mutual information was close to its maximum value within the first 1000 steps of the optimization, as was observed in the results presented in Figures 12.5 and 12.6. The values of the mutual information and of the accuracy, measured on the training dataset, show the anticipated increase with the number of oscillators forming the classifier. The trend is confirmed by the values of I(R, O) and of the accuracy A obtained using a large test dataset (500,000 records). Here, I(R, O) = 0.453, 0.468, 0.470 and 0.513 for T4, T10, DS and REG, respectively. Similarly, the values of A are 0.872, 0.877, 0.880 and 0.893. Therefore, we can see that both the mutual information and the accuracy increase with the number of oscillators forming the network. Nevertheless, the increase is slow compared with the growth in the number of oscillators and the related increase in the numerical complexity of the optimization.

12.4. Conclusions

In this chapter, we have answered a few technical questions related to the optimization of classifiers for the SIC problem that are based on a network of chemical oscillators. In our simulations, we used the simplified and fast-to-simulate event-based model illustrated in Figure 12.1. The network optimization was done by a simple evolutionary algorithm. We have found that it is relatively easy to obtain a classifier that determines the point location with an accuracy higher
than 85%, and such accuracy can be achieved by a network formed by only four oscillators. The number of records in the training dataset K and the size of the optimized population M are important parameters of the program, and they have a direct impact on the computation time. We have found that reasonable results for the SIC problem can be obtained when both K and M are of the order of several hundred. However, for an objective estimation of the classifier accuracy (say, with an error below 1%), the classifier should be tested on much larger datasets, containing 50,000 records or more.

It is worthwhile to mention that we did not observe a significant influence of the size of the dataset used for optimization on the classifier accuracy. The results obtained for training datasets in the range between 200 and 2000 records were almost identical. This can be explained using the following argument: it seems that networks of the considered medium are not able to classify the SIC problem perfectly, because none of the classifiers showed an accuracy exceeding 90%. To achieve an accuracy of 85+%, it is not necessary to use a training database that describes the problem with high accuracy. For such accuracy, an approximation of the SIC problem by a training dataset of 200 records seems to be sufficient to introduce the correlations in the oscillator network that reflect the data structure.

The improvement of classifier accuracy by selecting the proper network geometry was the main motivation for our study. We considered networks of different geometries, and some of them reflected the symmetry of the problem. The results are rather negative: none of the structures seems to perform much better than the others. In all cases, we achieved an accuracy close to ∼87%, but we were not able to increase the accuracy above 90%. We believe that this limitation can be linked to the inflexibility of the event-based model. A future improvement of chemical classifiers seems to be possible if we use more realistic models of the oscillators and of the interactions between them, including parameters that allow control over these processes.

Acknowledgments

The work was supported by the Polish National Science Centre grant UMO-2014/15/B/ST4/04954.

References

1. K. Gizynski and J. Gorecki, A chemical system that recognizes the shape of a sphere. Comput. Meth. Sci. Technol. 22(4), 167–177 (2016).
2. L. Zommer, K. Gizynski, and J. Gorecki, On the predictive power of database classifiers formed by a small network of interacting chemical oscillators. Int. J. Unconv. Comput. 14(2), 159–179 (2019).
3. The Guardian article, https://www.theguardian.com/environment/2017/dec/11/tsunami-of-data-could-consume-fifth-global-electricity-by-2025 (2017).
4. G. E. Moore, Cramming more components onto integrated circuits. Proc. IEEE 86(1), 82–85 (1998). For recent data see, for example, http://en.wikipedia.org/wiki/Moore%27s_law.
5. T. Gramss, S. Bornholdt, M. Gross, M. Mitchell, and T. Pellizzari, Non-Standard Computation: Molecular Computation – Cellular Automata – Evolutionary Algorithms – Quantum Computers (Wiley-VCH Verlag GmbH & Co. KGaA, 1998).
6. C. Calude and G. Paun, Computing with Cells and Atoms: An Introduction to Quantum, DNA and Membrane Computing (CRC Press, 2000).
7. A. Adamatzky, L. Bull, and B. D. L. Costello, Unconventional Computing (Luniver Press, 2007).
8. A. Adamatzky, Advances in Unconventional Computing (Springer, 2017).
9. H. Haken, Brain Dynamics: Synchronization and Activity Patterns in Pulse-Coupled Neural Nets with Delays and Noise (Springer, Berlin/Heidelberg, Germany, 2002).
10. B. P. Belousov, in Collection of Short Papers on Radiation Medicine (Medgiz, Moscow, 1959), pp. 145–152.
11. A. M. Zhabotinsky, Periodic liquid phase reactions. Proc. Acad. Sci. USSR 157, 392–395 (1964).
12. J. J. Tyson, What everyone should know about the Belousov–Zhabotinsky reaction. In S. A. Levin (ed.), Frontiers in Mathematical Biology (Springer, Berlin/Heidelberg, Germany, 1994), pp. 569–587.
13. I. R. Epstein and J. A. Pojman, An Introduction to Nonlinear Chemical Dynamics: Oscillations, Waves, Patterns, and Chaos (Oxford University Press, New York, NY, USA, 1998).
14. J. Sielewiesiuk and J. Gorecki, Passive barrier as a transformer of chemical signal frequency. J. Phys. Chem. A 106, 4068–4076 (2002).
15. F. Muzika, L. Schreiberova, and I. Schreiber, Chemical computing based on Turing patterns in two coupled cells with equal transport coefficients. RSC Adv. 4(99), 56165–56173 (2014).
16. F. Muzika, L. Schreiberova, and I. Schreiber, Discrete Turing patterns in coupled reaction cells in a cyclic array. React. Kinet. Mech. Catal. 118(1), 99–114 (2016).
17. A. Toth and K. Showalter, Logic gates in excitable media. J. Chem. Phys. 103(6), 2058–2066 (1995).
18. O. Steinbock, P. Kettunen, and K. Showalter, Chemical wave logic gates. J. Phys. Chem. 100(49), 18970–18975 (1996).
19. I. N. Motoike and K. Yoshikawa, Information operations with an excitable field. Phys. Rev. E 59(5), 5354–5360 (1999).
20. J. Sielewiesiuk and J. Gorecki, Logical functions of a cross junction of excitable chemical media. J. Phys. Chem. A 105(35), 8189–8195 (2001).
21. A. Adamatzky, B. D. L. Costello, and T. Asai, Reaction-Diffusion Computers (Elsevier, New York, NY, USA, 2005).
22. R. P. Feynman, in A. J. G. Hey and R. W. Allen (eds.), The Feynman Lectures on Computation (Addison-Wesley, MA, USA, 1996).
23. A. Adamatzky and B. D. L. Costello, Experimental logical gates in a reaction-diffusion medium: The XOR gate and beyond. Phys. Rev. E 66(4), 046112 (2002).
24. O. Steinbock, A. Toth, and K. Showalter, Navigating complex labyrinths: optimal paths from chemical waves. Science 267(5199), 868–871 (1995).
25. K. Agladze, N. Magome, R. Aliev, T. Yamaguchi, and K. Yoshikawa, Finding the optimal path with the aid of chemical wave. Physica D 106(3–4), 247–254 (1997).
26. L. Kuhnert, A new optical photochemical memory device in a light-sensitive chemical active medium. Nature 319(6052), 393–394 (1986).
27. L. Kuhnert, K. I. Agladze, and V. I. Krinsky, Image processing using light-sensitive chemical waves. Nature 337(6204), 244–247 (1989).
28. N. G. Rambidi and A. V. Maximychev, Towards a biomolecular computer. Information processing capabilities of biomolecular nonlinear dynamic media. BioSystems 41(3), 195–211 (1997).
29. H.-J. Krug, L. Pohlmann, and L. Kuhnert, Analysis of the modified complete Oregonator accounting for oxygen sensitivity and photosensitivity of Belousov–Zhabotinsky systems. J. Phys. Chem. 94(12), 4862–4866 (1990).
30. S. Kádár, T. Amemiya, and K. Showalter, Reaction mechanism for light sensitivity of the Ru(bpy)3^2+-catalyzed Belousov–Zhabotinsky reaction. J. Phys. Chem. A 101(44), 8200–8206 (1997).
31. K. Gizynski and J. Gorecki, Chemical memory with states coded in light controlled oscillations of interacting Belousov–Zhabotinsky droplets. Phys. Chem. Chem. Phys. 19(9), 6519–6531 (2017).
32. K. Yoshikawa, I. Motoike, T. Ichino, T. Yamaguchi, Y. Igarashi, J. Gorecki, and J. N. Gorecka, Basic information processing operations with pulses of excitation in a reaction-diffusion system. Int. J. Unconv. Comput. 5(1), 3–37 (2009).
33. J. Szymanski, J. N. Gorecka, Y. Igarashi, K. Gizynski, J. Gorecki, K.-P. Zauner, and M. De Planque, Droplets with information processing ability. Int. J. Unconv. Comput. 7(3), 185–200 (2011).
34. J. Holley, I. Jahan, B. D. L. Costello, L. Bull, and A. Adamatzky, Logical and arithmetic circuits in Belousov–Zhabotinsky encapsulated disks. Phys. Rev. E 84(5), 056110 (2011).
35. M. Kuze, M. Horisaka, N. J. Suematsu, T. Amemiya, O. Steinbock, and S. Nakata, Chemical wave propagation in the Belousov–Zhabotinsky reaction controlled by electrical potential. J. Phys. Chem. A 123(23), 4853–4857 (2019).
36. I. L. Mallphanov and V. K. Vanag, Fabrication of new Belousov–Zhabotinsky micro-oscillators on the basis of silica gel beads. J. Phys. Chem. A 124(2), 272–282 (2020).
37. J. Guzowski, K. Gizynski, J. Gorecki, and P. Garstecki, Microfluidic platform for reproducible self-assembly of chemically communicating droplet networks with predesigned number and type of the communicating compartments. Lab Chip 16(4), 764–772 (2016).
38. R. J. Field and R. M. Noyes, Oscillations in chemical systems. IV. Limit cycle behavior in a model of a real chemical reaction. J. Chem. Phys. 60(5), 1877–1884 (1974).
39. A. B. Rovinsky and A. M. Zhabotinsky, Mechanism and mathematical model of the oscillating bromate-ferroin-bromomalonic acid reaction. J. Phys. Chem. 88, 6081–6084 (1984).
40. G. Gruenert, J. Szymanski, J. Holley, G. Escuela, A. Diem, B. Ibrahim, A. Adamatzky, J. Gorecki, and P. Dittrich, Multiscale modelling of computers made from excitable chemical droplets. Int. J. Unconv. Comput. 9(3–4), 237–266 (2013).
41. G. Gruenert, K. Gizynski, G. Escuela, B. Ibrahim, J. Gorecki, and P. Dittrich, Understanding computing droplet networks by following information flow. Int. J. Neur. Syst. 25(7), 1450032 (2015).
42. K. Gizynski, G. Gruenert, P. Dittrich, and J. Gorecki, Evolutionary design of classifiers made of droplets containing a nonlinear chemical medium. Evol. Comput. 25(4), 643–671 (2017).
43. K. Gizynski and J. Gorecki, Cancer classification with a network of chemical oscillators. Phys. Chem. Chem. Phys. 19(42), 28808–28819 (2017).
44. J. Gorecki, Applications of information theory methods for evolutionary optimization of chemical computers. Entropy 22(3), 313 (2020).
45. J. Gorecki and A. Bose, How does a simple network of chemical oscillators see the Japanese flag? Front. Chem. 8, 899 (2020).
46. T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley-Interscience, New York, 2006).
47. C. E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 and 623–656 (1948).
48. D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1st edn. (Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989).
49. D. B. Fogel, An introduction to simulated evolutionary optimization. IEEE Trans. Neural Netw. 5(1), 3–14 (1994).
50. K. Weicker, Evolutionäre Algorithmen (Springer, Berlin/Heidelberg, Germany, 2007).

© 2021 World Scientific Publishing Company
https://doi.org/10.1142/9789811235740_0013

Chapter 13

Sensing and Computing with Liquid Marbles

Andrew Adamatzky∗,†,‡, Benjamin de Lacy Costello†, Thomas C. Draper∗, Claire Fullarton∗, Richard Mayne∗, Neil Phillips∗, Michail-Antisthenis Tsompanas∗ and Roshan Weerasekera∗

∗Unconventional Computing Lab, UWE, Bristol, UK
†Department of Applied Sciences, UWE, Bristol, UK
‡[email protected]

Liquid marbles (LMs) are droplets of liquid coated with a hydrophobic powder. LMs behave as soft bodies and are easy to manipulate. We introduce experimental laboratory prototypes of sensing and computing devices made with LMs. The devices include collision-based logical gates implemented via the ballistic interaction of LMs, photo- and thermo-sensing devices made of LMs with the Belousov–Zhabotinsky (BZ) medium as a cargo, BZ LM robotic controllers, and neuromorphic LMs with carbon nanotubes. The prototypes developed could be used as components of soft and liquid electronics and computing devices.

13.1. Introduction

Liquid marbles (LMs) were first reported by Aussillous and Quéré in 2001,1 and have since become increasingly popular in chemistry, particularly in condensed matter. They are composed of two parts: a microliter-sized core of liquid (usually water), surrounded by a powder coating. This gives them their other name, "particle-coated droplets". A typical volume of a LM is 10 μL, which results in a typical diameter of 3 mm. A schematic of a LM is shown in Figure 13.1. We shall look at both the core and the coating in turn.

Figure 13.1. A schematic diagram of a LM. The core generally consists of water or glycerol, and the hydrophobic powder coating could be PTFE, PE, lycopodium grains, etc. A typical diameter of the entire LM is 3 mm, whilst the powder particles could be 10 nm to 400 µm in diameter. Note the non-homogeneous coating.

The bulk of a LM is composed of the core. This microliter droplet is generally made of water (because its high surface tension allows for the easiest LM formation), though glycerol is quite common,2 and even petroleum has been used.3 This chapter will focus on water-filled LMs. The coating of a LM is comprised of a micro- or nano-sized powder that (for water cores) is hydrophobic. "Hydrophobic" comes from the Ancient Greek for "fear of water". Chemically, hydrophobic powders normally lack polar intramolecular bonds, which results in few intermolecular hydrogen bonds forming between the water and the substrate. It is this shortage of attractive forces that is often (mistakenly) portrayed as a water-repelling repulsive force. Common examples of LM powder coatings include polytetrafluoroethylene (PTFE),4 polyethylene (PE)5 and modified lycopodium grains.1 A variety of possible powder coatings is demonstrated in Figure 13.2. Note the difference in particle size, especially in the hybrid example shown in Figure 13.2(d). This (in combination with the powder's degree of hydrophobicity) gives rise to very different characteristics of LM lifetime, ruggedness and hysteresis. For a recent overview of these dependencies, see Ref. [6]. As can be seen in Figures 13.1 and 13.2, the coating of LMs is not homogeneous. Rather, it is a mixture of single-layer and multilayer particles. Whether a single layer or a multilayer is formed depends on the identities of both the core and the coating. Particles with a very high surface contact angle (such as PTFE) tend to form single layers.

Figure 13.2. Photographs of 10.0 µL liquid marbles. All examples have a diameter of 3 mm. The powder coatings portrayed are (a) PTFE (grain size: 6–10 µm), (b) PE (grain size: 100 µm), (c) nickel (grain size: 4–7 µm) and (d) a nickel–PE hybrid.

Conversely, less hydrophobic particles (such as PE or nickel) tend to form a multilayer. It is possible to convert a multilayer LM into a single-layer one by repeated rolling: the excess particles fall off the LM, leaving a single layer. There are often gaps in the particle coating, where the surface of the core is exposed to the atmosphere and therefore visible. As (perhaps) anticipated, this is more common in single-layered LMs. As a LM ages, however, it loses some of its aqueous core to evaporation, which results in a slight contraction of the shell. This has the effect of closing these exposed sections. One would intuitively anticipate
A. Adamatzky et al.

that this would, in turn, reduce the evaporation rate of the LM, however this is not the case.6 The reasons behind this are still unclear, although a possibility is that the shrinking in pore size causes an increase in the capillary effect. This would result in the water core being closer to the surface, and therefore able to evaporate faster. One of the main features of LMs, and one of the reasons for their use in this project, is that they roll with minimal resistance. On a typical surface (e.g., glass), a water droplet will adhere to the surface, causing resistance to its motion. Conversely, if a water droplet is placed on a hydrophobic surface (e.g., a non-stick frying pan), the water droplet will bead up and roll off with ease. A LM is literally coated in a hydrophobic powder, generating a very large contact angle, and therefore rolls with extreme ease. This gives LMs great merit in fields as diverse as glue delivery systems7 and digital microfluidic bioassays.8 LMs are a new, but strong, player in the field of digital microfluidics.9 Microfluidics involves the formation, behavior, and ultimately the control of microliter quantities of fluid. It is a multidisciplinary area, with a large and growing interest in automation and highthroughput screening. Digital microfluidics is when the fluid is in discrete droplets, as oppose to in a continuous flow. There are a number of ways to manipulate these droplets, the most common of which are magnetic,10 electrowetting on dielectric (EWOD),11 and surface acoustic wave (SAW).12 All of these techniques, however, require both a pre-treated surface and electricity. By encapsulating the droplet in a particle coating and forming a LM, there is the potential to remove both of these limitations. 13.2. Collision-based Gate In interaction gates, Boolean values of inputs and outputs are represented by presence of physical objects at given site at a given time. If an object is present at input/output we assume that logical value of the input/output is True; if the object is absent the logical value is False. The signal-object realize a logical function when they pass through a collision site. The objects might fuse, annihilate or deflect on impact.

Figure 13.3. Interaction-based gates. (a) Fusion of signals. (b) Annihilation of signals. (c) Elastic deflection of signals. (d) Non-elastic deflection of signals.

The fusion gate (Figure 13.3(a)) was first implemented in fluidic devices in the 1960s. The gate is the most well-known device in fluidics (on a par with the bistable amplifier)13, 14: two nozzles are placed at right angles to each other; when there are jet flows in both nozzles, they collide and merge into a single jet entering the central outlet. If the jet flow is present in only one of the input nozzles, it goes into the vent. The central outlet represents and and the vent represents and-not. The fusion-based gate was also employed in designs of computing circuits in Belousov–Zhabotinsky (BZ) medium,15–18 where excitation wave-fragments merge when they collide; in the actions of slime mould,19 where vesicles being distributed collide; and in a crab-based gate,20 where swarms of soldier crabs merge into a single swarm. In the annihilation gate (Figure 13.3(b)), signals disappear on impact. This gate has two inputs and two outputs, each of the outputs representing and-not. Computational universality of Conway's Game of Life cellular automata was demonstrated using annihilation-based collisions between gliders.21 We can also implement the
annihilation gate by colliding excitation wave-fragments at certain angles.22 A key deficiency of the fusion and annihilation gates is that, when implemented in media other than excitable spatially-extended systems, they do not preserve the physical quantity of signals: for example, when two signals merge, the output signal has double the mass of a single input signal. This deficiency is overcome in conservative logic, proposed by Fredkin and Toffoli in 1978.23 Logical values are represented by solid elastic bodies, aka billiard balls, which deflect when made to collide with one another (Figure 13.3(c)). Intact output trajectories of the balls represent the and-not function; output trajectories of deflected balls represent the and function. Gates based on elastic collision led to the development of reversible (both logically and physically) gates: the Fredkin23 and Toffoli24 gates, which are the key elements of low-power computing circuits,25, 26 and amongst the key components of quantum27–30 and optical31 computing circuits.

The soft-sphere collision gate proposed by Margolus32 gives us a rather realistic representation of interaction gates with real-life physical and biological bodies (Figure 13.3(d)). Logical value x = 1 is given by a ball present on the input trajectory marked x, and x = 0 by the absence of the ball on the input trajectory x; the same applies to y = 1 and y = 0, respectively. When two balls, approaching the collision gate along paths x and y, collide, they compress but then spring back and reflect. As a result, the balls come out along the paths marked xy. If only one ball approaches the gate, that is, for inputs x = 1 and y = 0 or x = 0 and y = 1, the ball exits the gate via path xȳ (for input x = 1 and y = 0) or x̄y (for input x = 0 and y = 1). Soft-sphere-like gates have been implemented using microlitre-sized water droplets on a superhydrophobic copper surface.33 Using channels cut into the surface, not-fanout, and-or and flip-flop gates were demonstrated. The water droplets only rebounded in a very narrow window of collision properties, and a thorough and complete superhydrophobic surface treatment was required.
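
The input-output behavior of the soft-sphere gate can be summarized in a short sketch (illustrative Python; the path names mirror the xy, x̄y and xȳ labels used above). Note that the gate is conservative: the number of balls leaving always equals the number entering:

    def soft_sphere_gate(x: bool, y: bool):
        """Margolus soft-sphere gate: two balls compress, rebound and
        leave on the two xy paths; a lone ball crosses undisturbed."""
        return {"xy_a": x and y,          # deflected ball (AND)
                "xy_b": x and y,          # deflected ball (AND)
                "x_noty": x and not y,    # lone x exits via the x.not-y path
                "notx_y": y and not x}    # lone y exits via the not-x.y path

    # Conservation check over all four input combinations.
    for x in (False, True):
        for y in (False, True):
            outs = soft_sphere_gate(x, y)
            assert sum(outs.values()) == int(x) + int(y)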

We first developed a technique for the regular and automatic formation of invariable LMs. This was achieved by programming a syringe driver (CareFusion Alaris GH) to feed a 21 gauge needle (0.8 mm diameter) at a typical rate of 7.0 mL h⁻¹. The rate can easily be increased or decreased, and this rate gave sufficiently fast LM formation for our purpose. The produced droplets (11.60(16) μL) were permitted to fall onto a sheet of acrylic, slanted at 20° from horizontal and surface-treated with a commercial hydrophobic spray (Rust-Oleum® NeverWet®). This formed beads of water, which were allowed to roll over a bed of the appropriate hydrophobic powder. The result was a continuous "stream" of LMs with the same volume, coating and coating thickness. It should be noted that, whilst the forming of LMs by running droplets down a powder slope has been separately developed by another group,34 our system prevents premature destruction of the powder bed by initially preforming the droplet on a treated hydrophobic surface.

In collision-based computing, accurate timing is essential. As signals propagate through the system they must remain in sync, or the operation of many logic gates fails. In order to address this, an innovative system of electromagnets (EMs) was implemented. This was possible due to the generation of novel LMs with a mix of ultra-high density polyethylene (UHDPE) (Sigma-Aldrich, 3–6 × 10⁶ g mol⁻¹, grain size approximately 100 μm) and nickel (GoodFellow Metals, 99.8%, grain size 4–7 μm). A typical Ni/UHDPE coating was 2.5 mg. The use of UHDPE provides strength and durability, and the inclusion of ferromagnetic nickel allows for a versatile magnetic LM. By positioning an electromagnet (100 N, 12.0 V DC, 29 × 22 mm) behind the acrylic slope, the rolling LM can be captured and released at will, by switching the electromagnet on and off. By controlling multiple, spatially-isolated electromagnets in series, non-concurrent LMs can be easily synchronized. To our knowledge, this is the first time electromagnets have been used to provide timing control with liquid marbles. The collision gate was designed to allow the colliding LMs a free path post-collision. This enabled the monitoring of the LM paths, and the future design and implementation of exit pathways, creating a logic gate.
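
The synchronizing role of the electromagnets can be captured in a few lines (a sketch under the assumption that each EM simply holds its LM until a common release instant; hold_margin_s is an illustrative parameter, not a measured value):

    def em_hold_times(arrival_times_s, hold_margin_s=0.1):
        """Each EM captures its LM on arrival; all EMs release together
        shortly after the last arrival, absorbing timing jitter."""
        release_at = max(arrival_times_s) + hold_margin_s
        return [release_at - t for t in arrival_times_s]

    # Two LMs arriving 170 ms apart leave perfectly in step:
    print(em_hold_times([1.02, 1.19]))   # per-EM hold times, ≈ [0.27, 0.10] s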

Two 16.0 cm acrylic pathways were slanted towards each other at 20°, affixed to an acrylic base sheet (3.0 mm thick). The acrylic base sheet was then aligned with a pitch of 38° from horizontal, giving a final LM pathway slope of 16° from the horizontal plane. This gave reliable LM rolling without extreme angles. The gap between the two slanted pathways was set at 1.6 cm, after empirical testing. A 2.0 cm length at the top of each pathway was made hydrophobic, as discussed above. Parallel auto-formation of hybrid LMs was achieved using the syringe driver, delivering 11.6 μL of water per syringe per drop. Each droplet of water was permitted to land on the treated section of each slope, before rolling across the powder beds of UHDPE and Ni to form LMs. The two rolling LMs were then captured using the electromagnets, allowing for any slight timing deviations to be accounted for. On controlled synchronous (or asynchronous) release of the electromagnets, the LMs simultaneously roll off the acrylic ramps on collision trajectories. Collisions were recorded at 120 fps using a Nikon Coolpix P900, and played back frame-by-frame for analysis. A schematic of our LM collider can be seen in Figure 13.4, and photographs of our LM collider can be seen in Figure 13.5. If the gate timings are not accurate, then the LMs will continue on their separate paths and shall not collide.

Figure 13.4. Schematic of our LM collider. Labeled numbers are: (1) syringe needle, (2) uncoated water droplet, (3) acrylic ramp, (4) hydrophobic powder bed, (5) liquid marble, and (6) electromagnet. Droplets form and fall out of the two syringe needles, landing on a superhydrophobic surface. They then roll over a bed of Ni/UHDPE powder, before being stopped and held stationary by the electromagnets. These electromagnets are then deactivated simultaneously, allowing the LMs to roll off and collide.

Figure 13.5. Photographs of our LM collider, showing (a) the overall layout, (b) a close-up of the magnetic braking/release area, and (c) a close-up of the droplet formation area. Labels are: (1) syringe needle, (2) acrylic ramp, (3) hydrophobic powder bed, and (4) electromagnet location. The electromagnet is positioned at the back, out of view in these photos. All scale bars are 30 mm.

If the timings are accurate, and it is taken that the two LMs are identical, then there are three possible outcomes of the collision. Firstly, the LMs collide with some elasticity and then continue on two distinct, new paths. Secondly, the LMs collide with no elasticity and then continue vertically as two adjacent, but distinct, LMs. Thirdly, on collision the LMs coalesce into a larger single LM with zero lateral velocity, which continues vertically down. Video snapshots showing both a single uninterrupted LM and a colliding pair of LMs can be seen in Figure 13.6. The time between the LMs leaving the ramp, colliding, and being visually separated again is 75 ms. This represents an upper limit on LM collision-based computing time for a 1-bit calculation. Analysis of ten collisions shows only a slight deviation in the collision exit trajectory. Taking a vertical line from the center of the collision as the reference line, the exit trajectories for the LMs are −5.3(15)° and +5.5(11)°.

Figure 13.6. Overlaid still frames of (a) a single LM, with frames at 0 ms, 142 ms, 209 ms, 242 ms, 267 ms; and (b) two colliding LMs, with frames at 0 ms, 125 ms, 200 ms, 217 ms, 225 ms, 250 ms and 275 ms. Both scale bars are 20 mm.

Our experiments demonstrate that LMs collide in an elastic manner. This is unsurprising, due to their previously reported soft-shell and compressible nature.1, 35 It also supports the previously published linear, non-coalescing collision of LMs.35 The elastic properties and compressible nature of LMs have been discussed elsewhere in a recent review.36 By monitoring the collisions at 120 fps, it was observed that LMs behave like two soft balls, acting in the manner described by the Soft Sphere Model (SSM), known as a Margolus gate.32 A video of a typical collision can be seen in the supporting information. The important distinction between the SSM and the better-known Billiard Ball Model (BBM)23 is the exit points of the colliding particles compared to the non-colliding particles. In the BBM, the particles are taken to be hard spheres, which instantly rebound off each other, leading to the AB paths lying outside the corresponding ĀB and AB̄ paths. In contrast, the SSM accounts for the finite and appreciable amount of time required for real-world soft spheres to rebound. The result is that the AB paths move to lie inside the unchanged ĀB and AB̄ paths. The BBM and SSM pathways can be seen in Figures 13.7(a) and 13.7(b), respectively. It was possible to break the SSM analogy by increasing the speed of the LMs. The speed of the LMs was calculated by measuring the distance travelled by the LM in a certain number of frames, knowing the recording frame rate. When the collision happens at 0.21 m s⁻¹, the LMs bounce elastically following SSM paths.

Figure 13.7. The colliding and non-colliding routes for (a) BBM and (b) SSM pathways. (c) A collision of steel balls, following the BBM under gravity (cf. the SSM LMs above). Frame times are 0 ms, 175 ms, 242 ms, 284 ms, and 325 ms. Scale bar is 20 mm.

Figure 13.8. Overlaid still frames showing the coalescence of two colliding LMs. Frames shown at 0 ms, 117 ms, 159 ms, 184 ms, 209 ms, 234 ms, and 250 ms. Scale bar is 20 mm.

However, when the speed of collision is increased to 0.29 m s⁻¹, the two LMs coalesce. This can be seen in the video snapshots in Figure 13.8. It should be noted that the previously mentioned LM computing time is reduced to 50 ms when the LMs are permitted to coalesce. This is measured from when the LMs leave the ramp to when they are fully coalesced; the time is saved by not requiring the LMs to separate. Growth by coalescence of LMs is a commonly observed effect.37
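
The speed measurement and the observed speed dependence of the outcome can be summarized as follows (illustrative Python; the 0.25 m s⁻¹ boundary is an assumed midpoint between the two observed regimes, since the exact crossover was not determined):

    FPS = 120  # recording frame rate of the camera

    def lm_speed(distance_m, n_frames, fps=FPS):
        """Speed from the distance covered over a counted number of frames."""
        return distance_m / (n_frames / fps)

    def collision_outcome(speed_ms):
        """~0.21 m/s gave SSM rebounds; ~0.29 m/s gave coalescence."""
        return "SSM rebound" if speed_ms < 0.25 else "coalescence"

    v = lm_speed(0.035, 20)          # 35 mm over 20 frames -> ≈0.21 m/s
    print(v, collision_outcome(v))   # ≈0.21, SSM rebound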

Figure 13.9. (a) Overlaid frames showing the successful reflection of a LM; frames are timed at 0 ms, 234 ms, 334 ms, 375 ms, 400 ms, 434 ms, 476 ms and 517 ms. Scale bar is 20 mm. (b), (c) The outcomes of a single unreflected LM passing through the adder. (d) The outcome of the adder when two LMs collide and coalesce. (e) The outcome of the adder when two LMs collide according to the SSM. (f) The electronic representation of a 1-bit half-adder.

For a computing device, the physical nature of the input and output signals should be exactly the same. When two LMs coalesce, the output mass is double a single input mass. Consequently, if LMs are collided at this higher speed, a splitting device would be required to reduce the mass of the output LM. The facile splitting of a LM using a superhydrophobic-treated scalpel has previously been reported.38 By analysing the output paths of the LM collider, it becomes apparent that the gate could be modified to act as a 1-bit half-adder, with the possible outcomes demonstrated in Figure 13.9. When a single LM traverses the system from the A or B channel, it finishes at the left or right extreme, on the AB̄ or ĀB path, respectively. Once the exit pathways are combined, this is analogous to the sum output of a half-adder. An initial trial confirming the feasibility of this is shown in Figure 13.9(a), where a single LM enters from the right channel, crosses the gap, and is reflected to exit on the right side. When two synchronized LMs pass through the collider, they either rebound or coalesce, according to their velocity at impact. If the LMs coalesce, then the new LM travels straight down the only AB path, which can be considered to be the carry output.
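
Read as Boolean logic, this is exactly a half-adder (a sketch of the logical reading, not of the physical device): a lone LM reaches the combined side paths (sum, i.e., XOR), while two LMs meet on the central AB path (carry, i.e., AND):

    def lm_half_adder(a: bool, b: bool):
        s = a != b      # exactly one LM present: exits via a side path (XOR)
        c = a and b     # both LMs present: central AB path (AND)
        return s, c

    for a in (False, True):
        for b in (False, True):
            s, c = lm_half_adder(a, b)
            print(f"A={int(a)} B={int(b)} -> sum={int(s)} carry={int(c)}")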

Alternatively, if the LMs rebound, as in the SSM, then there are two AB paths. One of these paths is then considered to be the carry output, and the other is discarded (the choice between the two AB paths is arbitrary in this case). As stated above, there are approximately 10° of departure between the two AB paths, making separation facile. By using this design (complete with magnetic timing control) and an intuitive xor gate, we can adapt the model of the one-bit full-adder proposed originally for the BZ medium.39 The xor gate can be replaced with an or gate without loss of logic, as there is no situation where two LMs will arrive simultaneously. The design schematic can be seen in Figure 13.10. There are two sets of electromagnets, which cycle on-off in pairs: first EM1 releases, then shortly after EM2 releases. This maintains synchronization between LMs across the two collision gates. The signal delays (indicated in Figures 13.10 and 13.11 by wavy lines) are useful for helping timing control for any future additional cascades. These can be implemented using channel curves, reflectors, or with EMs. The EMs' delay could be controlled using capacitance. Additionally, the Cin channel could be released simultaneously with the A and B channels, using signal delay to ensure collision.
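
The behavior of this design can be checked against the example operations of Figure 13.11 with a short simulation (illustrative Python; each collision gate either coalesces two incident LMs into the carry channel or passes a lone survivor onward):

    def collision_gate(p: bool, q: bool):
        """Both LMs in -> coalesced LM to carry; one in -> it survives."""
        return p and q, p != q                        # (carry, survivor)

    def lm_full_adder(a: bool, b: bool, c_in: bool):
        carry1, survivor = collision_gate(a, b)       # first gate: A meets B
        carry2, total = collision_gate(survivor, c_in)  # second gate: meets Cin
        return total, carry1 or carry2   # OR suffices: carries never coincide

    # The four operations of Figure 13.11:
    assert lm_full_adder(True, True, False) == (False, True)   # 1+1+0 = 10
    assert lm_full_adder(False, True, True) == (False, True)   # 0+1+1 = 10
    assert lm_full_adder(False, True, False) == (True, False)  # 0+1+0 = 01
    assert lm_full_adder(True, True, True) == (True, True)     # 1+1+1 = 11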

Figure 13.10. The general design schematic for a 1-bit full-adder, operated using LMs.

Figure 13.11. Example operations of the 1-bit full-adder to sum (a) 1 + 1 + 0 = 10, (b) 0 + 1 + 1 = 10, (c) 0 + 1 + 0 = 01, and (d) 1 + 1 + 1 = 11.

For this design iteration, we have used channels for the passage of LMs. This is a deviation from pure collision-based computing, where free space is used as momentary 'wires' on an ad hoc basis. However, in this case, we believe the use of channels to be an important intermediate step towards this goal. For the example operations visualized in Figure 13.11: if LMs travel down the A and B channels, then they will collide and travel straight to the carry output. If LMs travel down the B and Cin paths, then the B LM crosses the first gate before colliding with the Cin LM and travelling straight down to join the carry output. If a single LM travels down the B path, it will cross the first and second gates, finishing on the sum output. If LMs travel down the A, B and Cin paths, then A and B will collide at the first gate and go straight to the carry output, whilst the Cin LM will cross its gate and finish on the sum output. Based on the observations of our collision gate, we note that such a full-adder could be implemented with dimensions of approximately 15 × 15 cm, using LMs with a volume of approximately 10 μL and a LM pre-collision run length of 2 cm. This gate could then be cascaded as required to produce an n-bit full-adder. Signal timing across multiple cascading gates would be controlled using multiple, staged EMs.

Empirical testing has shown that reliable and manipulable LMs can be formed down to 1.0 μL, meaning that the device could be scaled down appropriately. Using a single set of syringes, the automatic marble maker can produce up to eight LMs per needle per second. At this speed, synchronization of the electromagnets becomes crucial. Initial investigations into the collision lifetime of the LMs are promising, with six 10.0 μL LMs confined to a 2.5 × 2.5 cm space on an orbital shaker at 100 rpm (covered to minimize evaporation) showing no signs of wear after three hours.

13.3. Belousov–Zhabotinsky Cargo

A non-stirred BZ medium40, 41 exhibits a rich spectrum of oxidation wave-front dynamics. In the late 1980s, Kuhnert, Agladze and Krinsky42, 43 experimentally demonstrated the implementation of image processing and memory in BZ media, where data were input optically and the results of the computation were represented by the patterns of the oxidation wave-fronts. Since then, a large variety of unconventional computing devices have been implemented using the BZ medium. These include diodes,44 logical gates,45 robot controllers,46, 47 counters,48 neuromorphic architectures49–52 and arithmetical circuits.53, 54 Methods for optical, geometrical, or chemical control of oscillation frequency have been employed to prototype logical gates,55 modulators,56 filters,57 memory,58 fuzzy logic,50 and oscillatory associative memory.59 While most BZ computing devices use the presence of a wave-front in a selected locus of space as a manifestation of logical True, there is a body of work on information coding with frequencies of oscillations. Thus, Gorecki et al.55 proposed to encode True as high frequency and False as low frequency: or gates, not gates and a diode have been realized in numerical models. Other results in BZ frequency-based information processing include frequency transformation with a passive barrier,56 a frequency band filter,57 and memory.58 Using frequencies is in line with current developments in oscillatory logic,60 fuzzy logic,50 oscillatory associative memory,59 and computing in arrays of coupled oscillators.61, 62
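
A frequency-coded bit is read off simply by thresholding (a minimal sketch; the 0.05 Hz threshold is illustrative, as the working frequencies depend on the particular chemistry):

    THRESHOLD_HZ = 0.05   # illustrative boundary between 'high' and 'low'

    def read_frequency_bit(period_s):
        """True for high-frequency oscillations, False for low-frequency."""
        return (1.0 / period_s) > THRESHOLD_HZ

    print(read_frequency_bit(10.0))   # 0.10 Hz  -> True
    print(read_frequency_bit(60.0))   # ~0.017 Hz -> False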

In order to prototype a fully-fledged computer in BZ medium, we need to encapsulate and isolate parts of the medium, for example, following the theoretical ideology of membrane computing, or P-systems, introduced by Păun in 1998.63–66 Encapsulation can be achieved by immersing BZ droplets in oil and allowing pores to form between the droplets, thus permitting the exchange of information in the form of travelling oxidation wave-fronts. This has been demonstrated in Refs. [67–74]. This approach, however, has its limitations, because the liquid "mother" solution (in which the BZ droplets are dispersed) requires special handling. An ideal approach would be to make BZ droplets "dry", yet still able to produce controllable oscillation dynamics. The required alternative comes in the form of liquid marbles.

To evaluate the behavior of excitation in BZ liquid marbles we conducted a series of experiments. The BZ reaction studied (the ferroin-catalyzed/malonic acid BZ reaction) was prepared using the method reported by Field,75 omitting the surfactant Triton X. 18 M sulphuric acid H2SO4 (Fischer Scientific, CAS 7664-93-9), sodium bromate NaBrO3 (Sigma Aldrich, CAS 7789-38-0), malonic acid CH2(COOH)2 (Sigma Aldrich, CAS 141-82-2), sodium bromide NaBr (Sigma Aldrich, CAS 7647-15-6), and 0.025 M tris-(1,10-phenanthroline) iron(II) sulphate (ferroin indicator, Sigma Aldrich, Honeywell Fluka, CAS 14634-91-4) were used as received. Coatings for LMs, ultra-high density polyethylene (PE) (Sigma Aldrich, CAS 9002-88-4, Product Code 1002018483) and polytetrafluoroethylene (PTFE) (Alfa Aesar, CAS 9002-84-0, Product Code 44184), were used as received, with particle sizes of 100 μm and 6–10 μm, respectively. H2SO4 (2 mL) was added to deionized water (67 mL) to produce 0.5 M H2SO4; NaBrO3 (5 g) was added to this to yield 70 mL of stock solution containing 0.48 M NaBrO3. Stock solutions of 1 M malonic acid and 1 M NaBr were prepared by dissolving 1 g in 10 mL of deionized water. In a 50-mL beaker, 0.5 mL of 1 M malonic acid was added to 3 mL of the acidic NaBrO3 solution. 0.25 mL of 1 M NaBr was then added to the beaker, which produced bromine.
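
As a quick arithmetic check on the quoted stock concentrations (an illustrative calculation using standard molar masses):

    M_NaBrO3 = 150.89    # g/mol
    M_MALONIC = 104.06   # g/mol, CH2(COOH)2
    M_NaBr = 102.89      # g/mol

    print(5.0 / M_NaBrO3 / 0.070)    # ~0.47 mol/L, quoted as 0.48 M NaBrO3
    print(1.0 / M_MALONIC / 0.010)   # ~0.96 mol/L, quoted as 1 M malonic acid
    print(1.0 / M_NaBr / 0.010)      # ~0.97 mol/L, quoted as 1 M NaBr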

The reaction was left until a clear, colorless solution remained (ca. 5 min), before 0.5 mL of 0.025 M ferroin indicator was added to the beaker. BZ LMs were prepared by pipetting droplets of the BZ reaction mixture (50 and 100 μL), which was already oscillating, onto a powder bed of either polyethylene (PE) or polytetrafluoroethylene (PTFE) in a weighing boat, releasing the droplet ca. 5 mm above the top of the powder bed. The BZ droplet was rolled on the powder bed for 10 s to produce a LM. Single BZ LMs were then transferred onto a cool white LED housed in a black plastic box (a single 5 mm diameter cool white LED, 5000–8300 K, powered by a standard 9 V battery) to highlight the oscillating reaction inside the marble and enable the observation of travelling wave-fronts through the LM coating. Disordered arrays of LMs were prepared by transferring the premade LMs into a Petri dish. Ordered LM arrays were prepared by rolling LMs onto a 4 × 4 polypropylene template. Both BZ arrays were then illuminated using an LED light underneath the Petri dish. The BZ reaction in LMs was recorded using a USB microscope at ×5 magnification.

Initially, to investigate the feasibility of preparing stable LMs using the acidic BZ media, powder coatings of PE and PTFE were studied. The rationale behind using PE and PTFE was that these polymers have previously been demonstrated as suitable coatings for LMs, are relatively inert to acidic solutions, provide a comparison between LM coating particle sizes, and appear translucent when illuminated with an LED.6 The latter meant it was possible to observe the oxidation waves through both PE and PTFE coatings. The difference in particle size between the coatings varied the particle spacing on the surface of the LM, which therefore varied, to different degrees, the visibility of the oscillating BZ solution within the LM. Studies focused on observing the color changes and oxidation waves of the BZ media in single LMs, and then observing the behavior of oxidation waves in disordered and ordered arrays of LMs. The BZ solution used to make the LMs was at a concentration that exhibited oscillating behavior; when the prepared BZ solution was left unstirred in a thin film in a Petri dish, pattern formation occurred. The full videos of all single LMs can be found in the ESI.

Figure 13.12. A single 50 µL PTFE-coated BZ LM at the start of the recording ((a), 1 s) and during the 1st oxidation wave observed ((b)–(f), at 691 s, 697 s, 703 s, 709 s and 716 s).

Figure 13.12 shows the propagation of the 1st travelling wave observed in a single 50 μL PTFE-coated BZ LM. For the wave to propagate across the 1st and 2nd halves of the LM took on average 12 ± 2 s and 17 ± 4 s, respectively, with a full oscillation taking on average 29 ± 5 s. Ten travelling waves were visible. Some centralized oscillations occurred at the start of the experiment, shown in Figure 13.12(b). Travelling waves did not appear until ca. 11 min after the start of the experiment, propagating from right to left across the LM, as shown in Figures 13.12(b)–13.12(f). The LM started buckling only ca. 2 min after positioning. In Figure 13.12(e), the travelling wave appears to split into two wave-fronts, potentially arising from the buckling of the LM coating. Some gas evolution can be observed, shown by the trapped bubble in Figures 13.12(b)–13.12(f). Full oxidation of the ferroin to ferriin had occurred within the marble after ca. 25 min.

Figure 13.13. A single 100 µL PTFE-coated BZ LM at the start of the recording ((a), 840 s) and during the only wave observed ((b)–(f), at 850 s, 860 s, 865 s, 870 s and 875 s).

Figure 13.13 shows the propagation of the only travelling wave observed in a single 100 μL PTFE-coated BZ LM. For the wave to propagate across the 1st and 2nd halves of the LM took 19 s and 21 s, respectively, with the full oscillation taking 40 s. Only one oscillation was observed, because it took significantly longer to prepare a viable PTFE-coated BZ LM to record. For this reason, together with the fast onset of buckling, no further experiments were performed on PTFE-coated BZ LMs. Figure 13.14 shows the 1st and 2nd travelling waves observed in a single 50 μL PE-coated BZ LM. For the wave to propagate across the 1st and 2nd halves of the LM took on average 10 ± 2 s for both, with a full oscillation taking on average 20 ± 3 s to travel across the diameter of the LM. 55 single travelling waves were visible, which moved across the LM from top to bottom. After ca. 2 min, small movements of the LM were observed. However, it was hard to judge whether this was a result of variations in interfacial tension due to the travelling waves, previously observed in BZ droplets in oil,76–78 or simply the movement of the powder coating due to evaporation of the encapsulated BZ solution. After ca. 18 min, buckling of the 50 μL PE-coated BZ LM was observed. After ca. 34 min, some gas evolution (CO2 and/or CO) was observed under the coating.

Figure 13.14. A single 50 µL PE-coated BZ LM showing the 1st oxidation wave ((a)–(c), at 312 s, 324 s and 337 s) and the 2nd ((d)–(f), at 526 s, 538 s and 552 s).

However, the bubbles did not affect the travelling waves, as most had been observed by this time. Multiple oscillations occurred after ca. 36 min, at which time significant buckling of the marble had occurred. Full oxidation of the ferroin to ferriin had occurred within the marble after ca. 55 min. In a repeat experiment, for the wave to propagate across the 1st and 2nd halves of the LM took on average 14 ± 1 s and 12 ± 2 s, respectively, with a full oscillation taking on average 25 ± 2 s. 32 single travelling waves were visible, slightly fewer than in the previous 50 μL LM, attributed to the preparation and setup time of the LM underneath the camera. The travelling waves in this LM were observed to move across the LM from right to left. After ca. 19 min, buckling was observed, as for the previous 50 μL LM. After ca. 39 min, again some gas evolution occurred. After ca. 45 min, the travelling waves were not easy to distinguish and multiple oscillations started occurring. Full oxidation of the ferroin to ferriin had occurred within the LM after ca. 49 min. After this length of time, gas bubbles of significant size can be observed trapped under the buckled coating of the LM.

Figure 13.15. A single 100 µL PE-coated BZ LM showing every oxidation wave observed ((a)–(l), at 597 s, 609 s, 632 s, 1268 s, 1286 s, 1309 s, 1900 s, 1911 s, 1920 s, 1922 s, 1925 s and 1933 s).

Figure 13.15 shows the travelling waves in a single 100 μL PE-coated BZ LM. For the wave to move across the 1st and 2nd halves of the LM took on average 18 ± 6 s and 22 ± 1 s, respectively, with a full oscillation taking on average 40 ± 6 s. Five single travelling waves
were visible (at 1897 s, three waves were observed at one time, shown in Figures 13.15(j)–13.15(l)), travelling from right to left across the LM. Gas evolution occurred after ca. 25 min, slightly earlier than observed for the 50 μL LMs. Significant buckling of the LM occurred after ca. 30 min. Full oxidation of the ferroin to ferriin had occurred after ca. 42 min. In a repeat experiment, for the wave to propagate across the 1st and 2nd halves of the LM took on average 16 ± 5 s and 19 ± 4 s, respectively, with a full oscillation taking on average 35 ± 7 s. Six single travelling waves were visible, propagating from the bottom to the top of the LM. After ca. 19 min, buckling of the 100 μL PE-coated BZ LM was observed. Gas evolution occurred after ca. 25 min, the same as for the previous 100 μL single BZ LM analyzed. Multiple oscillations occurred after ca. 21 min. Full oxidation of the ferroin to ferriin had occurred within the marble after ca. 42 min. The smaller-volume single LMs exhibited more visible oscillations than the larger-volume LMs. Small vibrations, impacts, and collisions caused both PE BZ LMs and PTFE BZ LMs to coalesce or burst relatively easily. PE-coated LMs were easier to roll into position, whether over the LED to record a single LM or into disordered and ordered arrays. The single BZ LM experiments established which of the two selected coatings was viable for encapsulating the BZ media.

Disordered and ordered arrays of PE-coated BZ LMs were prepared to observe whether transfers of oxidation waves occurred between LMs in close proximity, and to report propagation pathways within these different arrays. PE-coated BZ LMs for the various arrays were prepared using 50 μL and 100 μL droplets of already-oscillating BZ media. For disordered arrays of BZ LMs, a number of LMs of the same volume were rolled into a Petri dish. For the 50 μL and 100 μL BZ LM disordered arrays, the numbers of LMs used were 14 and 15, respectively. Unfortunately, it proved difficult to completely fill the Petri dish, due to the limited stability of the LMs. To discuss the transfer of waves and propagation pathways, BZ LMs in the disordered arrays were numbered, as shown in Figures 13.16 and 13.17 for the 50 μL and 100 μL disordered arrays, respectively.

Figure 13.16. 50 µL PE-coated BZ LM disordered array — top numbers (in black) refer to the numbers assigned to the LMs in the array, bottom numbers (in white) refer to the number of individual oscillations observed in each LM.

Figure 13.17. 100 µL PE-coated BZ LM disordered array — top numbers (in black) refer to the number assigned to the LMs in the array, bottom numbers (in white) refer to the number of individual oscillations observed in each LM.


In total, 84 individual oscillations were observed in the 50 μL disordered array, with 14 of these waves resulting in a transfer from one LM to another. Therefore, 17% of oscillations resulted in transfers from one LM to another. All the LMs in the 50 μL disordered array oscillated; LM4 oscillated the most, with ten visible wave-fronts, whilst LM3 oscillated the least, with four visible wave-fronts. The number of individual oscillations each LM exhibited is reported in Figure 13.16. Of the 14 waves observed to transfer, two waves from two different LMs appeared to result in a transfer to a single LM. The longest propagation pathway observed was LM2–LM4–LM5, arising from the 3rd and 4th wave transfers. For the 100 μL disordered array, 15 LMs were used, as shown in Figure 13.17. 153 oscillations were observed, 31 of which resulted in transfers from one marble to another. Therefore, 20% of oscillations resulted in transfers from one marble to another, similar to the percentage observed for the 50 μL disordered array. The number of individual oscillations each marble exhibited is reported in Figure 13.17. The video for the BZ 100 μL disordered array can be found in the ESI. The longest propagation pathway observed in the 100 μL disordered array involved three LMs, arising from the 9th and 10th wave transfers. As can be seen from the 50 μL disordered array transfer videos in the ESI, and from the 1st transfer from the 50 μL disordered array shown as an example in Figure 13.18, the direction of oxidation wave transfer between LMs demonstrates that excitation passes through the LM disordered array, rather than being a result of spontaneous self-oscillation of the media inside a single LM. Similar wave transfers have previously been observed in BZ vesicles.79 Ordered arrays of 50 μL BZ LMs were prepared by using a polypropylene template to position the marbles in a 4 × 4 arrangement. This allowed more control over the number of contacts each LM had with adjacent LMs. Figure 13.19 shows the 16-LM ordered array recorded, in which 264 oscillations were observed. All LMs turned to the fully oxidized state after ca. 1 h 20 min. Fewer transfers occurred in the ordered array than in the disordered array, with only six inter-marble transfers occurring.

Figure 13.18. 1st transfer observed in the 50 µL PE-coated BZ LM disordered array, transferring from LM7 to LM10 ((a)–(f), at 820 s, 825 s, 830 s, 835 s, 840 s and 845 s).

Figure 13.19. 50 µL PE-coated BZ LM ordered array — top numbers (in grey) refer to the numbers assigned to the LMs in the array, bottom numbers (in white) refer to the number of individual oscillations observed in each LM.


This means only 2% of oscillations transferred to an adjacent marble. The longest pathway observed involved three marbles (occurring during the 1st and 2nd transfers). In another 4 × 4 ordered array, 182 oscillations were observed. All LMs in this array turned to the fully oxidized state after ca. 1 h.
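
The transfer fractions quoted for the three arrays follow directly from the counts (a worked check in Python):

    arrays = {
        "50 uL disordered":  (14, 84),    # (transfers, oscillations)
        "100 uL disordered": (31, 153),
        "50 uL ordered 4x4": (6, 264),
    }
    for name, (transfers, total) in arrays.items():
        print(f"{name}: {100 * transfers / total:.0f}% transferred")
    # -> 17%, 20% and 2%, as reported above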

13.4. Photosensor

A key limitation of using BZ LMs as computing elements is the lack of a definite optical indication of the oxidation wave-fronts, especially if the coating is non-transparent. To overcome this limitation, we proposed to measure the oscillations of electrical potential caused by oxidation wave-fronts, by inserting electrodes into the BZ-medium-filled LMs. We previously made a prototype BZ marble thermal sensor80 (Section 13.5), and demonstrated that it is possible to reduce the frequency of oscillations, or even completely halt oscillations, by "freezing" them. This method is robust, but requires Peltier elements and, therefore, lacks the scalability demanded from a computational viewpoint. An alternative means of controlling oscillations in BZ marbles is illumination with visible light, as the BZ medium is proven to be a light-sensitive chemical system due to the photochemical properties of the catalyst.43, 81–85 Based on this rationale, we decided to explore whether it is possible to prototype a BZ LM photosensor, which would react to illumination with consequent changes in its oscillatory patterns of electrical potential. Our laboratory experiments and computer modelling conclusively demonstrated that BZ LMs are fully functional photosensors capable of responding to a series of optical stimulations. The LMs in this report were all formed using the BZ reaction medium as the aqueous core and ultra-high density polyethylene (PE) (Sigma-Aldrich, 3–6 × 10⁶ g mol⁻¹, grain size approximately 100 μm) as the powder coating. LMs were produced by hand-rolling droplets of the pre-prepared BZ medium (62 μL) on a powder bed of PE until a uniformly coated LM was generated. The BZ reaction medium was prepared using a modified version of a literature method, omitting the surfactant Triton-X.75 Sodium bromide, malonic acid, sodium bromate, and ferroin were sourced
from Sigma-Aldrich. Sulphuric acid was sourced from Fischer Scientific. All reagents were used as received, without any further purification. Sodium bromate (5.0 g) was added to sulphuric acid (0.5 M, 69 mL) with stirring. An aliquot of this solution was taken (3.0 mL), and the remainder stored for future use. To the aliquot were added malonic acid (1.0 M, 0.5 mL) and sodium bromide (1.0 M, 0.25 mL), resulting in a transiently orange solution caused by the evolution of bromine gas. Once the reaction mixture became colorless once more, ferroin was added (0.025 M, 0.5 mL, Sigma-Aldrich product 318922, see Ref. [75]), yielding approximately 4.25 mL of the BZ reaction medium. The produced BZ LMs were set on a Petri dish, where they were fixed in place by being punctured with two iridium-coated stainless steel sub-dermal needles with twisted cables (Spes Medica SRL, Italy), for use as electrodes. The electrical output of the LM was logged with a PicoLog ADC-24 high-resolution data logger (Pico Technology Ltd, UK). The BZ LMs were stimulated with a cold light source (PL2000, Photonic Optics, USA; 3250 K, 18 Mlux) for 300 s on average over the experiments. A scheme of the experimental setup is shown in Figure 13.20.

Figure 13.20. Experimental setup. (a) A scheme of the experimental setup: A — BZ marble, B — a pair of electrodes, C — Pico ADC-24 logger, D — light source. (b) A photo of a BZ marble with electrodes inserted. (c) A photo of a BZ marble illuminated (the intensity of illumination is lower than in the experiments, to prevent photo overexposure).


Figure 13.21. Typology of oscillations. (a) Average period of oscillations vs standard deviation for all marbles; data points labeled with an asterisk are considered to be low-frequency liquid marbles. (b) Zoomed-in view of the domain where the majority of marbles are observed; each dot on the plots is derived from experiments with a unique marble. (c), (d) Exemplar experimental oscillations of (c) low-frequency marbles (indicated in (a) by arrows) and (d) high-frequency marbles.


Average periods of oscillations for 19 marbles, recorded before stimulation with light, together with standard deviations (σ) for the respective LMs, are shown in Figure 13.21(a). Each dot corresponds to a particular LM. Most dots are concentrated in the [average period of oscillations, standard deviation] domain [0, 70] s × [0, 15] s. The three dots indicated by arrows and an asterisk in Figure 13.21(a) correspond to periods of oscillation over 240 s. The periods are 247 s (σ = 17), 290 s (σ = 7), and 298 s (σ = 61). An example of low-frequency oscillations recorded from a BZ LM is shown in Figure 13.21(c). An excitation wave-front in BZ medium has a positively charged head and a negatively charged tail. Based on the shape of the signals in the exemplar shown in Figure 13.21(c), we can claim that the first spike recorded an event when the oxidation wave-front crossed the electrodes in the direction from the recording to the reference electrode (thus a sharp increase of the potential followed by a decrease). Then, an excitation wave-front crossed the electrodes in the direction from the reference to the recording electrode (the last three spikes in Figure 13.21(c)). The marbles of the low-frequency group reacted to the luminous stimulus as follows (see the exemplar potential recorded in Figure 13.22(a)). Two marbles stopped oscillating, and one marble retained its oscillations but reduced its period from 247 s to 56 s. The mean LM oscillation period after the stimulation was removed was 31 s (σ = 9). The response to stimulation was instant in the two LMs that halted their oscillations, while the LM that briefly continued oscillating did so in a disorganized manner for a few minutes before finally halting (Figure 13.22(b)). The average response to the halting of stimulation for low-frequency LMs was 66 s (σ = 28). The reusability of the marble photosensor is evidenced in Figure 13.22(b), where the LM halted oscillations nearly instantly on the second round of stimulation. In the majority of cases (16 LMs, 84% of the total), periods of oscillation (Figure 13.21(b)) ranged from 17 s to 72 s, with an average of 36 s (σ = 13) and a median of 33 s; see the example in Figure 13.21(d). In the case of typical oscillations, the phases of the potential change are less pronounced than in low-frequency oscillations. This might be due to the excitation source being located away from the electrodes.
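
The per-marble statistics plotted in Figure 13.21(a) reduce to a mean and standard deviation of inter-spike intervals (a sketch with hypothetical spike times; the 240 s cut-off for the low-frequency group is taken from the outliers above):

    import statistics

    def oscillation_stats(spike_times_s):
        """Mean period and standard deviation from a train of spikes."""
        periods = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
        return statistics.mean(periods), statistics.stdev(periods)

    def is_low_frequency(mean_period_s):
        return mean_period_s > 240   # the three outlier marbles exceed 240 s

    mean_p, sd = oscillation_stats([0, 290, 583, 873])   # hypothetical spikes
    print(round(mean_p), round(sd), is_low_frequency(mean_p))   # 291 2 True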


Figure 13.22. Experimentally recorded low-frequency responses of BZ marbles to stimulation with light: (a), (b) potential vs time, with the moments when the illumination was switched ON and OFF marked.

All marbles responded to light stimulation with a nearly instant halting of their oscillations. On switching the illumination off, the LMs resumed their oscillatory activity. The LMs can be split into two groups based on the change of frequency after the stimulus is removed: Group A oscillate with decreased frequency post-stimulation (six LMs), and Group B oscillate with increased frequency (eight LMs). In Group A, the average period changed from 30 s (σ = 9) to 56 s (σ = 25). In Group B, the average period changed from 45 s (σ = 16) to 27 s (σ = 6). Roughly, we can say that the marbles in Group A halve their frequencies as a result of stimulation
and the marbles in Group B double their frequencies. The marbles respond to the removal of the stimulus with nearly the same delay: Group A in 41 s (σ = 19) and Group B in 42 s (σ = 30).
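
The two response groups can be told apart from a single before/after comparison of periods (an illustrative sketch of this classification):

    def classify_response(period_before_s, period_after_s):
        """Group A: period grows (frequency roughly halves);
        Group B: period shrinks (frequency roughly doubles)."""
        return "Group A" if period_after_s > period_before_s else "Group B"

    print(classify_response(30, 56))   # Group A, average periods 30 s -> 56 s
    print(classify_response(45, 27))   # Group B, average periods 45 s -> 27 s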

13.5. Thermal Sensor

The temperature sensitivity of the BZ reaction was first analyzed in detail by Blandamer and Morris,86 who, in 1975, showed that the frequency of oscillations of the redox potential in a stirred BZ reaction depends on temperature. The periods of oscillation reported were 190 s at 25°C, 70 s at 35°C, and 40 s at 45°C. In 1988, Vajda et al.87 demonstrated that temporal oscillations of a BZ mixture persist in a frozen aqueous solution at −10°C to −15°C. By tracing the Mn²⁺ ion signal amplitude, they showed that frozen BZ solutions oscillate 3 times faster at −10°C, and 11 times faster at −15°C, than liquid-phase BZ. This increase in oscillation frequency has been explained by the formation of crystals and interfacial phenomena during freezing. This might be partly supported by experiments with a chlorite-thiosulphate system frozen to −34°C.88 There, the velocity of wave-fronts increases because, en route to total freezing, the reaction occurs only in the thin liquid layer at the periphery of the solid domain, where the concentrations of chemicals are temporarily higher. In 2001, Masia et al.89 monitored oscillations in non-stirred BZ in a 4 cm³ batch reactor via the solution's absorbance at 320 nm. The reactor was kept at various temperatures through thermostatic control. They reported periodic oscillations at temperatures of 0°C to 3°C, quasi-periodic at 4°C to 6°C, and chaotic at 7°C to 8°C. Bansagi et al.90 experimentally demonstrated that by increasing the temperature from 40°C to 80°C it is possible to obtain oscillations with frequencies over 10 Hz; they also showed that the frequency of oscillations grows proportionally to temperature (in the range studied). Ito et al.91 reported a linear dependence of the oscillation period of polymers impregnated with BZ on temperature, in the range 5°C to 25°C. BZ LMs were produced as described in Section 13.4. A scheme of the experimental setup is shown in Figure 13.23(a).


Figure 13.23. Experimental setup. (a) A scheme of the setup: A — BZ LM, B — a pair of electrodes, C — Pico ADC-24 logger, D — Peltier element, E — fans, F — power supply for fans, G — power supply for the Peltier element, H — thermocouple, I — TC-08 thermocouple data logger. (b) Dynamics of the temperature on the surface of the Petri dish when the Peltier element is powered at 7 V. The moment of power-on is shown by "*" and power-off by "#".

A LM was placed in a Petri dish (35 mm diameter) and pierced with two iridium-coated stainless steel electrodes (sub-dermal needle electrodes with twisted cables; Spes Medica SRL, Via Buccari 21, 16153 Genova, Italy). The electrical potential difference between the electrodes was recorded with a Pico ADC-24 high-resolution data logger (Pico Technology, St Neots, Cambridgeshire, UK) at a sampling interval of 25 ms. The Petri dish with the LM was mounted on a Peltier element (100 W, 8.5 A, 20 V, 40 × 40 mm, RS Components Ltd., UK), which in turn was fixed to an aluminium heat sink with Silver CPU Thermal Compound, cooled by two 12 V fans (powered separately from the Peltier element). The temperature of the Peltier element was controlled via an RS PRO Bench Power Supply Digital (RS Components Ltd., Birchington Road, Corby, Northants, NN17 9RS, UK). The temperature at the bottom of the Petri dish was monitored using a TC-08 thermocouple data logger (Pico Technology, St Neots, Cambridgeshire, UK) at a sampling interval of 100 ms. A typical cooling rate was −1°C per 10 s, and a typical warming rate +1°C per 20 s; the exact shape of the temperature curves is shown in Figure 13.23(b).


Temperatures on the surface of the Petri dish below −2°C usually result in a burst LM. Two examples are illustrated in Figure 13.24. An intact LM (Figure 13.24(a)) shows oscillations with an average period of 26 s (between "A" and "B" in Figure 13.24(c)). When cooling is started ("B" in Figure 13.24(c)), the oscillations quickly become irregular, with low frequency and low amplitude (average period 49 s). Eventually the LM bursts ("C" in Figure 13.24(c)) and its cargo is relocated away from the electrodes (Figure 13.24(b)). In the scenario shown in Figures 13.24(d)–13.24(g), the LM undergoes two instances of freezing. The first time (marked "B" in Figure 13.24(g)), the LM (Figure 13.24(d)) survives being cooled down with just a slight change in shape (Figure 13.24(e)). The period of oscillations increases from 28 s in the intact LM to 162 s in the cooled-down LM (the period between "B" and "C" in Figure 13.24(g)). After the Peltier element is switched off (moment "C" in Figure 13.24(g)), the LM resumes high-frequency oscillations (period 42 s), but with a lower amplitude. The LM does not survive the second round of freezing ("D" in Figure 13.24(g)) and bursts, whilst still wetting the electrodes (Figure 13.24(f)). More examples of the electrical potential dynamics at temperatures causing LM bursting are shown in Figure 13.25. The temperature of −2°C is critical, in that over 70% of LMs burst and did not survive a second round of freezing. Therefore, in further experiments the LMs were cooled down to −1°C. The patterns of oscillation of LMs cooled down to −1°C show a high degree of polymorphism in amplitude (Figure 13.26). The corresponding changes in periods are collected in Table 13.1. If we ignore the first example (Figure 13.26(a)), then we have average p = 44.4 s (σ(p) = 12.5), average p∗ = 92 s (σ(p∗) = 28.6), and average p∗/p = 2.1 (σ(p∗/p) = 0.5). In the experiments shown in Figures 13.26(a)–13.26(e), LMs were kept cooled until the end of the experiments. In the experiment shown in Figure 13.26(f), cooling was started after 1310 s of the experiment, the Peltier element was switched off after 2254 s, and cooling was repeated at 2690 s. The intact LM oscillated with an average period of 47 s in the first phase of the experiment. The cooled LM oscillated with a period of 74 s. The period became 29 s after the warming. A second cooling increased the period to 67 s.


Figure 13.24. (a)–(c) LM bursts at the first freezing. (d)–(g) LM bursts at the second freezing. (a) Marble at the beginning of the experiment. (b) Marble burst at some point during freezing. (c) Plot of oscillations: A — the marble is stimulated with a silver wire, B — oscillations started, the Peltier element is switched on, C — the marble cools down and eventually bursts. (d) LM at the beginning of the experiment. (e) Cooled-down LM. (f) LM bursts and spreads at the second round of freezing. (g) Dynamics of electrical potential: A — marble is stimulated by a silver wire for 2–3 s, B — Peltier element is switched on, C — Peltier is switched off, D — Peltier is switched on again.


Figure 13.25. Dynamics of the electrical potential of LMs cooled (temperature at the bottom of the Petri dish) down to (a) −4°C, (b) −3°C, and (c) −2°C. Moments when the Peltier element is switched ON are shown by "*" and OFF by "#". (d) Temperature log corresponding to experiment (a).

Table 13.1. Effect of cooling to −1°C on the period of electrical potential oscillations of a BZ LM: p is the period of electrical potential oscillations of the LM at ambient temperature; p∗ is the period of electrical potential oscillations of the cooled LM.

Plot               p, s    p∗, s    p∗/p
Figure 13.26(a)    61      336      5.5
Figure 13.26(b)    59      126      2.1
Figure 13.26(c)    56      138      2.5
Figure 13.26(d)    22      67       3.0
Figure 13.26(e)    39      86       2.2
Figure 13.26(f)    47      74       1.6
Figure 13.26(f)    29      67       2.3
Figure 13.26(g)    39      86       2.2


Figure 13.26. Dynamics of the electrical potential of BZ LMs subjected to cooling down to −1°C and warming up. Moments when the Peltier element was switched ON are shown by "*" and OFF by "#". Moments when electrodes were inserted into the LM are shown by "!".

Thus, we have an increase of 1.6 times during the first cooling cycle and of 2.3 times during the second cooling cycle. In the experiment illustrated in Figure 13.26(g), oscillations were arrested by cooling, yet restarted when the LM was warmed. The period of oscillations before cooling was 47 s; after the oscillations restarted
following cooling, the period was 39 s. Repeated cooling did not arrest the oscillations, yet increased the oscillation period 2.2 times, to 86 s. In the experiment shown in Figure 13.26(h), we cooled a LM for short periods of time (199 s and 288 s) and did not observe any substantial changes in the periods of oscillation after the first freezing cycle: the average periods changed as follows: 46 s → 92 s → 98 s → 98 s → 98 s. To summarize, the average period of oscillations of a BZ LM doubles, from 44 s to 92 s, when the LM is cooled down to −1°C. The frequency of oscillations is restored after cooling is stopped. The amplitude of oscillations may increase or decrease as a result of cooling. Sometimes the oscillations can be completely arrested, yet resume after warming.
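
Read as a sensor, this doubling gives a simple decision rule (a sketch; the 1.5× factor is an assumed threshold placed between the ambient and cooled regimes, not a value from the experiments):

    AMBIENT_PERIOD_S = 44.0   # average period at ambient temperature

    def is_cooled(period_s, baseline_s=AMBIENT_PERIOD_S, ratio=1.5):
        """The period roughly doubles (44 s -> 92 s) near -1 C, so a
        period above 1.5x the ambient baseline is read as 'cold'."""
        return period_s > ratio * baseline_s

    print(is_cooled(92.0))   # True: the marble is being cooled
    print(is_cooled(46.0))   # False: ambient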

13.6. Robot Controller

The control of a wheeled robot was achieved by utilizing an on-board liquid marble with a cargo of BZ solution.47 The electrical connection of the LM to a conventional electrical circuit was implemented by inserting two stainless steel, iridium-coated electrodes into the LM. The observed electrical potential was translated into an input to the control system of the robot, altering its movement. The LM was stimulated with a high light intensity, using a laser beam. As the BZ cargo in the LM reformed its electrical oscillations in reaction to the light intensity, these changes were picked up by the electrodes and transmitted to the control circuit, where they were translated into motor activations. Initially, in order to unveil the electrical behavior of the oscillations occurring within a BZ LM, one such LM was placed on a Petri dish and two iridium-coated stainless steel electrodes were inserted into it, as illustrated in Figure 13.27(a). As electrodes, sub-dermal needles with twisted cables (Spes Medica SRL, Via Buccari 21, 16153 Genova, Italy) were utilized, while the potential difference was recorded with a high-resolution data logger (ADC-24, Pico Technology, St Neots, Cambridgeshire, UK) at a sampling frequency of 100 Hz.



Figure 13.27. Photos from experiments. (a) Electrodes inserted into a BZ marble on a Petri dish. (b) The BZ marble placed on top of the robot. (c) Close-up of the on-board BZ LM electrodes. (d) Movement of the robot with the BZ marble. (e) Illumination of the BZ LM that controls the robot.

while the electrodes were connected to the electrical digital circuit switching on the motors of the robot (Figure 13.27(c)). A common robot was selected, in order not to complicate the experiments, namely a pre-built robot branded Zumo.92 Since an Arduino Uno is used to control the movement of the robot, the Zumo was built as an Arduino shield so as to be easily coupled with the Arduino


board, as shown in Figure 13.27(d). The illumination of the LM was accomplished with a laser pointer emitting 5 mW of green light at a wavelength of 532 nm, as depicted in Figure 13.27(e). The laser pointer was handled by a human operator at a distance of ca. 20 cm from the LM. The wavelength of 532 nm approximates the absorption peak at 510 nm of a ferroin-catalyzed BZ medium,84 and is thus expected to affect the encapsulated reaction and alter the electrical output sensed by the inserted electrodes. Moreover, as the electrical potential realized in the first experiments was of low amplitude and had some negative values that the Arduino Uno board cannot read, an A–D (analogue-to-digital) converter (namely an ADS1118, Texas Instruments Incorporated) was incorporated into the design. Thus an increased resolution of 0.2 mV was reached. Furthermore, to analyze the potential of the LM in relation to the movement of the robot, an on-board SD-card module was added. A simple algorithm, aiming to illustrate the applicability of this method, was loaded on the Arduino to dictate the movement of the robot. After an initialization of the robot for 3 s upon its activation, the electrical potential values are read via the A–D converter, saved on the SD card, and then the appropriate movement of the robot is executed. As oxidation wave-fronts inside a BZ LM proceed from one area to another, they pass the electrodes and an oscillating electrical potential is registered. When illumination is applied to the surface of the LM, the dynamics of the wave-fronts alter, and thus the electrical potential alters as well, in accordance with the external stimulation of the BZ LM. To prove this in practice, a stationary BZ LM (Figure 13.27(a)) was exposed to a laser beam. The results are presented in Figure 13.28. While the electrical potential was oscillating with a mean value of ca. −7 mV, the light stimulation was applied at a positive peak of the oscillation. As observed in Figure 13.28(a), the oscillation is temporarily interrupted and smaller amplitudes are observed, before the values climb again to their previous peaks. In another instance, with the oscillation around 0 mV, as depicted in Figure 13.28(b), a rapid decrease of voltage is observed as a response to the external stimulus. Despite


[Figure 13.28: two traces of BZ marble output (mV) versus time (seconds), with laser ON/OFF moments marked.]

Figure 13.28. Dynamics of electrical potential recorded from BZ LMs via the Pico ADC-24. Moments when the LM was illuminated with a laser are shown by vertical lines.

this, the oscillation resumes with similar peaks and frequency as before the application of the stimulus. Taking into consideration the results of these tests, the algorithm handling the robot was designed to detect whether the values of electrical potential are positive or negative. Namely, if the potential


is positive then the robot turns left, otherwise it turns right. However, a threshold of 1 mV in absolute value was set to prevent unwanted movements caused by parasitic noise. Readings of the electrodes inserted into the BZ LM were taken every 2 s, used by the Arduino to steer the robot, and saved on the SD card to be analyzed after each experiment. After mounting the BZ LM on the robot and connecting the electrodes to the Arduino controlling the robot (through the A–D converter) (Figure 13.27(d)), an experiment on the behavior of the hybrid robot controller was executed. No external stimulus was applied in this first instance. The electrical potential of the BZ LM, as recorded by the on-board SD card, is presented in Figure 13.29(a). The oscillation is around the value of zero, with a period of ca. 30 s. Note here that the reading moments (every 2 s) can be spotted in the graph and are marked with asterisks for positive values (left turn of the robot), squares for negative values (right turn of the robot) and blue circles for values below the threshold (no movement of the robot is dictated). To better visualize the trajectory of the robot, right turns are depicted as red positions and left turns as green ones in Figure 13.29(b); the trajectory can be considered a roughly straight line (after an initial small right turn). The roughly straight line could be expected given the oscillation around zero. Note that the starting position is the lower center area. Next, an experiment testing the effect of the light stimulus on the BZ LM robot controller was executed. The illumination of the LM was performed by a human operator with a laser beam for ca. 10 s each time. Note here that the moments of application of the stimulus were chosen randomly, as no feedback mechanism was built into the configuration (the SD card was accessed after the finalization of the experiment). The oscillation of the electrical potential and the moments (and duration) of application of the light stimulus are presented in Figure 13.30(a). Here the values on the oscillation graph are marked as in the previous case; the initiation of the stimulus is depicted as vertical red dashed lines, whereas the moments when the stimulus was removed are shown with black solid lines.
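The decision rule just described is compact enough to state as code. The sketch below is our reconstruction in C++ of the loop's logic, not the published firmware; readPotentialMillivolts(), logToSd(), turnLeft(), turnRight() and waitSeconds() are hypothetical stand-ins for the ADS1118 reading, SD logging, Zumo motor commands and timing.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Stub hardware wrappers (hypothetical; stand-ins for the ADS1118 reading,
// SD logging and Zumo motor commands described in the text):
double readPotentialMillivolts() { return 0.0; /* replace with A-D read */ }
void   logToSd(double mv)        { std::printf("log %.2f mV\n", mv); }
void   turnLeft()                { std::printf("left turn\n"); }
void   turnRight()               { std::printf("right turn\n"); }
void   waitSeconds(double s) {
    std::this_thread::sleep_for(std::chrono::duration<double>(s));
}

const double kThresholdMv   = 1.0;  // 1 mV noise threshold from the text
const double kSamplePeriodS = 2.0;  // electrodes are read every 2 s

// Control loop as described: initialize for 3 s, then read-log-act every 2 s.
void controlLoop() {
    waitSeconds(3.0);
    for (;;) {
        double mv = readPotentialMillivolts();
        logToSd(mv);                             // analyzed after each run
        if (mv > kThresholdMv)       turnLeft();  // positive -> left turn
        else if (mv < -kThresholdMv) turnRight(); // negative -> right turn
        // |mv| <= 1 mV: treated as parasitic noise, no movement dictated
        waitSeconds(kSamplePeriodS);
    }
}
```

Each movement primitive corresponds, as quantified later in the text, to a 1.2 cm forward step and a 3-degree turn, each executed over 0.2 s.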


[Figure 13.29: panels (a) and (b); output (volts) versus time (seconds).]

Figure 13.29. Results of one experiment (with no stimulation using the laser beam). (a) Voltage output of the BZ LM and (b) trajectory of the robot. Supplementary Video BZRobot19.mp4 at Ref. [93].


[Figure 13.30: panels (a) and (b); output (volts) versus time (seconds), with laser ON/OFF moments marked.]

Figure 13.30. Results of one experiment (with stimulation using the laser beam). (a) Voltage output of the BZ LM and (b) trajectory of the robot. Supplementary Video BZRobot23.mp4 at Ref. [93].


It can be observed in Figure 13.30(a) that the first two applications of the stimulus affect the oscillations (which then differ from the oscillations without stimulus, cf. Figures 13.28 and 13.29(a)). Consequently, as portrayed in the trajectory of the robot (Figure 13.30(b)), it follows a left-turn (anti-clockwise, coded green) movement for longer than the usual time (the period of 30 s realized in previous experiments) and then shifts between turning right and left with normal frequency. Nonetheless, the remaining applications of the light stimulus do not appear to affect the electrical potential oscillation and, hence, the movement of the robot. Because the electrodes used are inserted through the LM and into the plastic container, the LM is fixed in position during robot movement and is unable to mix and reset the state of its reactants. This is evidenced by the similar periods of oscillation in the static experiment (Figure 13.28) and in the on-board moving-robot experiment (Figure 13.29), even with the contribution of the vibrations caused by sudden starting and stopping. As a consequence, speed restrictions of previous prototypes of BZ-controlled robots46 (forward speeds of ca. 1 cm/s and rotation speeds of ca. 1°/s) are eased. In the experiments executed in this study, the algorithm dictates that the robot move 1.2 cm forward (during 0.2 s) and turn in either direction by an angle of 3 degrees (during 0.2 s). Moreover, because of the electrical interface between the BZ LM and the Arduino board, no complicated custom-built hardware or image-processing algorithms were used, as previously.46, 94 This evolution to controller systems of lower complexity accelerates the progress towards future unconventional and soft robotics.

13.7. Neuromorphic Marbles

Previous sections have demonstrated how LMs are a malleable unconventional substrate for implementing simulacra of conventional, solid-state components. It has been argued, however, notably by Stepney,95 that attempting to emulate conventional architectures and components, which have had over half a century of intensive


development, is highly unlikely to produce devices of comparable utility. In this section, we explore the concept of using LMs as an unconventional substrate for implementing unconventional computing systems, presenting the "neuromorphic" LM.96 A system is said to be neuromorphic if it mimics the structure and/or functionality of the metazoan brain, nervous system, or nerve cells (neurons). It is widely considered that electrochemical activity within neural systems, that is, intracellular transduction and transmission of signals modulated by plastic intercellular (synaptic) junctions, supports the wide range of emergent phenomena that we may broadly refer to as "intelligence"; emulation of these low-power, massively-parallel characteristics is therefore the primary goal in the design of a neuromorphic system.95, 97, 98 Our goal was to create a LM with memory, which was achieved by using repeated electrical stimulation to "train" the marble into adopting either a high or a low electrical resistance state, towards demonstrating the potential of LMs in "smart" systems ranging from artificial brains to lab-on-a-chip applications. We opted to use carbon nanotubes (CNTs) as the active component in our neuromorphic LMs, following a rich pedigree of publications demonstrating their switching and synapse-like modulatory behaviors.99–104 Single-walled CNTs were used as LM cores in a 1.0 mg/mL deionized water suspension which also contained 1% (w/v) of a surfactant, Triton X-100. Electron microscopy revealed the CNTs to have a mean diameter of 20 nm (n = 50); their length was variable, but did not fall below 1.0 μm (Figure 13.31(a)). Micro-scale copper flakes, with a mean diameter of 76 μm (n = 838) (Figure 13.31(b)), were used as the LM coating, as the application required high electrical conductivity in addition to hydrophobicity. LM volume was 100 μL in all experiments; synthesis of LMs was achieved through the rolling method, although the presence of surfactant in their fluid interiors caused their shape to be distinctly flattened.



Figure 13.31. Electron micrographs of neuromorphic LM materials. (a) Single-walled CNTs. (b) Copper flakes.

LMs were transferred to a bespoke electrical measurement environment comprising two cup-shaped Ag/AgCl electrodes, where one was used to hold the LM and the other was lowered into position via a clamp stand (Figure 13.32). All electrical stimulation and measurement were done via a Keithley Source-Measure Unit 2450 (Keithley, USA), using a 4-to-2 direct-current electrode configuration.


Figure 13.32. Apparatus for electrical characterization. (a) Schematic diagram, where LM: liquid marble, E: electrodes, K: Keithley Sourcemeter, CS: clamp stand. (b) Photograph of apparatus. Scale bar 10 mm.

Our initial experiments consisted of repeatedly stimulating the CNT LMs with 3.0 V pulses (1.0 s pulses, 0.5 s ramp time, 0.1 A current limit), each followed by a delay (at 0 V) of the same duration. This was repeated 375 times, before the polarity was inverted and the pattern of stimulation continued at −3.0 V. The full experiment (n = 10) consisted of four of these phases, here named p1 → p4 (i.e., ((3.0 V × 375) → (−3.0 V × 375)) × 2), over a total duration of 50 min. Supplementary experiments consisted of running double-sided 3.0 V I–V sweeps, using the same equipment and a 0.1 s dwell time. All experiments were supported by controls in which (a) liquid marbles filled with deionized water or deionized water with 1% surfactant, and (b) 100 μL open-well samples of water, water/surfactant and CNT solutions, were tested using the same measurement parameters. Our primary finding was that entrainment as per the methods above generated a switching behavior which we dubbed the "neuromorphic switching effect" (NSE) (Figure 13.33(a)). Repeatedly stimulating the CNT-filled LMs with 3.0 V during p1 caused them to assume a high-resistance state; reversing the polarity during p2 was found to cause a statistically-significant (p < 0.001) reduction in



Figure 13.33. Electrical characterization of CNT-filled LMs. (a) Electrical resistance of a typical CNT LM between stimulation phases, where p1: red, p2: green, p3: orange, p4: blue; asterisk: NSE onset. (b) Typical I–V sweep profile from a CNT LM exposed to a 3.0 V double-sided sweep. Arrows indicate the direction of the sweep.


resistance about half-way through the phase (NSE onset). The mean resistance during p1 was subject to substantial variation (x̄ = 47 kΩ, s = 58 kΩ), likely due to the differing size of the top electrode contact patch, but the median percentage reduction in resistance between p1 → p2 was found to be 98%. Time to NSE onset was also variable, ranging from 11 s to 309 s. On switching polarity again during p3, the LMs rapidly reverted to a high-resistance state, whose value was again highly variable but was typically two orders of magnitude greater than the resistance in p2. During the final phase, p4, the LMs' resistance dropped into a low state again; there was no significant difference in resistance between phases p2 and p4, but onset of the NSE was significantly quicker (x̄ = 43%, p < 0.001) in the latter. These results demonstrate that these LMs not only exhibited switching behavior and memory of past states, but also responded to stimuli more quickly after multiple rounds of entrainment (potentiation). These phenomena are all core to the concept of synaptic plasticity and hence show how our LMs may be thought of as neuromorphic. By extension, each CNT-filled LM can be considered a soft, non-biological synapse. In further support of this, subsequent experiments using I–V sweeps revealed that these LMs were also memristive, as indicated by their characteristic pinched-loop hysteresis (Figure 13.33(b)). The NSE was not observed in any control experiments, and we ascertained in brief further experiments that it could be replicated with different electrode materials, including copper plates. Whilst our experiments did not elucidate the mechanism underlying the NSE, our analysis of relevant literature105–109 led us to hypothesize that repeated stimulation of the LMs caused surface charging on the CNTs within, which resulted in their being repelled from the metallic coating under one polarity and attracted under the other; in the latter case extra electrical connections were made across the surface of the LM, hence reducing its electrical resistance during phases p2 and p4 (Figure 13.34).
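To make the timing of the entrainment protocol concrete, the following sketch enumerates the four phases described above; it reflects our reading of the schedule (one 1.0 s pulse followed by a 1.0 s rest at 0 V per cycle, 375 cycles per phase, polarity inverted each phase) and is not instrument-control code for the Keithley unit.

```cpp
#include <cstdio>

// Enumerate the four entrainment phases p1..p4 described in the text:
// 375 cycles per phase, each cycle 1.0 s at +/-3.0 V plus 1.0 s at 0 V.
int main() {
    const int    cyclesPerPhase = 375;
    const double pulseS = 1.0, restS = 1.0;
    double t = 0.0;  // elapsed time in seconds
    for (int phase = 1; phase <= 4; ++phase) {
        double volts = (phase % 2 == 1) ? 3.0 : -3.0;  // p1,p3: +3 V; p2,p4: -3 V
        t += cyclesPerPhase * (pulseS + restS);
        std::printf("p%d: %+.1f V, ends at t = %.0f s\n", phase, volts, t);
    }
    std::printf("total: %.0f min\n", t / 60.0);  // 4 x 375 x 2 s = 3000 s = 50 min
}
```

The printed total of 50 min matches the overall experiment duration stated above, which is a useful consistency check on this reading of the protocol.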



Figure 13.34. Diagram illustrating the hypothetical model for the NSE, shown in a 2D quarter-circle cross section (not to scale). (a) System during phases p1, p3. (b) System during phases p2, p4; attraction of the CNTs to the copper creates more electrical connections and hence decreases the resistance of the coating.

13.8. Discussion

The use of LMs for collision-based computation has many advantages (additional degrees of freedom) over previous approaches. Due to their nature, it is possible to carry cargo in the LMs, which adds an additional dimension to the calculations. It is also possible to initiate chemical reactions within marbles by their coalescence.110 By varying the diameters of the LMs, different sizes can represent different values and will have different relative trajectories, removing the limitations of a binary system. Use of a magnet can remove the coating from magnetic marbles, which (if done on a superhydrophobic surface) can then roll freely down a slope as droplets before being reformed using a different coating. Compared to droplet computing,33, 111 only a tiny portion of the circuit needs to be rendered hydrophobic, making larger and more complicated circuits easier and cheaper to construct. These points, combined with LMs' ability to be easily merged, levitated,110 divided,38 opened/closed,112 and propelled by a variety of methods, make LMs a fascinating and potentially fruitful addition to the unconventional computing family. This work demonstrates that the BZ reaction can be directly incorporated into the electronic circuitry of a controller for a robot.


Limitations imposed by earlier prototypes of liquid-phase controllers, where robots were restricted to forward speeds of ca. 1 cm/s and rotation speeds of ca. 1°/s,46 were alleviated. The additional benefits of the BZ LM system were that no optical interfaces were required to monitor the BZ LM controller and that the geometries of the oxidation wave-fronts no longer needed to be analyzed. Hardware and software used in previous versions of the robot46, 94 (a light placed underneath the reaction contained within a Petri dish, a serial connection to a PC, and image-processing algorithms) are not necessary, as the BZ LMs are electrically connected to the microcontroller that delivers the trajectory of the robot. This reduction in the complexity of the controller system shows progress towards future unconventional and soft robotics. We find the prospect of neuromorphic LMs hugely exciting, as even single memristive devices can exhibit complex spiking behavior that supports combinatorial logic operations (e.g., adders);113, 114 hence this technology allows the unique utility of LMs to be enhanced by allowing for greater complexity of computation per device. Further work should expand on this theme, in addition to refining the engineering of LM production and electrical characterization.

Acknowledgments

This research was supported by the EPSRC with grant EP/P016677/1 "Computing with Liquid Marbles".

References

1. P. Aussillous and D. Quéré, Liquid marbles. Nature 411(6840), 924 (2001). doi: 10.1038/35082026.
2. P. Aussillous and D. Quéré, Properties of liquid marbles. Proc. Roy. Soc. Lond. A: Math. Phys. Eng. Sci. 462(2067), 973–999 (2006). doi: 10.1098/rspa.2005.1581.
3. E. Bormashenko, R. Pogreb, R. Balter, H. Aharoni, D. Aurbach, and V. Strelnikov, Liquid marbles containing petroleum and their properties. Pet. Sci. 12(2), 340–344 (2015). doi: 10.1007/s12182-015-0016-y.
4. P. S. Bhosale, M. V. Panchagnula, and H. A. Stretz, Mechanically robust nanoparticle stabilized transparent liquid marbles. Appl. Phys. Lett. 93(3), 034109 (2008). doi: 10.1063/1.2959853.


5. S. Asare-Asher, J. N. Connor, and R. Sedev, Elasticity of liquid marbles. J. Colloid Interface Sci. 449, 341–346 (2015). doi: 10.1016/j.jcis.2015.01.067.
6. C. Fullarton, T. C. Draper, N. Phillips, R. Mayne, B. P. J. de Lacy Costello, and A. Adamatzky, Evaporation, lifetime, and robustness studies of liquid marbles for collision-based computing. Langmuir 34(7), 2573–2580 (2018). doi: 10.1021/acs.langmuir.7b04196.
7. S. Chandan, S. Ramakrishna, K. Sunitha, M. Satheesh Chandran, K. S. Santhosh Kumar, and D. Mathew, pH-responsive superomniphobic nanoparticles as versatile candidates for encapsulating adhesive liquid marbles. J. Mater. Chem. A 5(43), 22813–22823 (2017). doi: 10.1039/C7TA07562F.
8. N. M. Oliveira, R. L. Reis, and J. F. Mano, The potential of liquid marbles for biomedical applications: A critical review. Adv. Healthc. Mater. 6(19), 1700192 (2017). doi: 10.1002/adhm.201700192.
9. N.-T. Nguyen, M. Hejazian, C. Ooi, and N. Kashaninejad, Recent advances and future perspectives on microfluidic liquid handling. Micromachines 8(6), 186 (2017). doi: 10.3390/mi8060186.
10. Y. Zhang, S. Park, K. Liu, J. Tsuan, S. Yang, and T.-H. Wang, A surface topography assisted droplet manipulation platform for biomarker detection and pathogen identification. Lab Chip 11(3), 398–406 (2011). doi: 10.1039/C0LC00296H.
11. R. B. Fair, Digital microfluidics: Is a true lab-on-a-chip possible? Microfluid. Nanofluidics 3(3), 245–281 (2007). doi: 10.1007/s10404-007-0161-8.
12. Z. Guttenberg, H. Müller, H. Habermüller, A. Geisbauer, J. Pipper, J. Felbel, M. Kielpinski, J. Scriba, and A. Wixforth, Planar chip device for PCR and hybridization with surface acoustic wave pump. Lab Chip 5(3), 308–317 (2005). doi: 10.1039/B412712A.
13. B. Peter, "AND" gate (1965). US Patent 3,191,611.
14. C. A. Belsterling, Fluidic Systems Design (John Wiley & Sons, 1971).
15. A. Adamatzky, Collision-based computing in Belousov–Zhabotinsky medium. Chaos, Solitons & Fractals 21(5), 1259–1264 (2004).
16. A. Adamatzky and B. de Lacy Costello, Binary collisions between wave-fragments in a sub-excitable Belousov–Zhabotinsky medium. Chaos, Solitons & Fractals 34(2), 307–315 (2007).
17. R. Toth, C. Stone, B. de Lacy Costello, A. Adamatzky, and L. Bull, Simple collision-based chemical logic gates with adaptive computing. Theoret. Technol. Adv. Nanotechnol. Mol. Comput.: Interdisc. Gains, 162 (2010). doi: 10.4018/978-1-60960-186-7.ch011.
18. A. Adamatzky, B. De Lacy Costello, L. Bull, and J. Holley, Towards arithmetic circuits in sub-excitable chemical media. Israel J. Chem. 51(1), 56–66 (2011).
19. R. Mayne and A. Adamatzky, On the computing potential of intracellular vesicles. PLoS One 10(10), e0139617 (2015). doi: 10.1371/journal.pone.0139617.


20. Y.-P. Gunji, Y. Nishiyama, A. Adamatzky, T. E. Simos, G. Psihoyios, C. Tsitouras, and Z. Anastassi, Robust soldier crab ball gate. Complex Syst. 20(2), 93 (2011).
21. E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning Ways for Your Mathematical Plays: Games in Particular, Vol. 2 (Academic Press, 1982).
22. A. Adamatzky, B. de Lacy Costello, and L. Bull, On polymorphic logical gates in subexcitable chemical medium. Int. J. Bifurc. Chaos 21(07), 1977–1986 (2011).
23. E. Fredkin and T. Toffoli, Conservative logic. In Collision-based Computing (Springer, 2002), pp. 47–81.
24. T. Toffoli, Reversible computing. In International Colloquium on Automata, Languages, and Programming (Springer, 1980), pp. 632–644.
25. A. De Vos, Reversible Computing: Fundamentals, Quantum Computing, and Applications (John Wiley & Sons, 2011).
26. C. H. Bennett, Notes on the history of reversible computation. IBM J. Res. Develop. 32(1), 16–23 (1988).
27. G. P. Berman, G. D. Doolen, D. D. Holm, and V. I. Tsifrinovich, Quantum computer on a class of one-dimensional Ising systems. Phys. Lett. A 193(5–6), 444–450 (1994).
28. A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. A. Smolin, and H. Weinfurter, Elementary gates for quantum computation. Phys. Rev. A 52(5), 3457 (1995).
29. J. A. Smolin and D. P. DiVincenzo, Five two-bit quantum gates are sufficient to implement the quantum Fredkin gate. Phys. Rev. A 53(4), 2855 (1996).
30. S.-B. Zheng, Implementation of Toffoli gates with a single asymmetric Heisenberg XY interaction. Phys. Rev. A 87(4), 042318 (2013).
31. N. Kostinski, M. P. Fok, and P. R. Prucnal, Experimental demonstration of an all-optical fiber-based Fredkin gate. Opt. Lett. 34(18), 2766–2768 (2009).
32. N. Margolus, Universal cellular automata based on the collisions of soft spheres. In A. Adamatzky (ed.), Collision-based Computing (Springer, 2002), pp. 107–134.
33. H. Mertaniemi, R. Forchheimer, O. Ikkala, and R. H. A. Ras, Rebounding droplet-droplet collisions on superhydrophobic surfaces: From the phenomenon to droplet logic. Adv. Mater. 24(42), 5738–5743 (2012). doi: 10.1002/adma.201202980.
34. S. Fujii, S. Kameyama, S. P. Armes, D. Dupin, M. Suzaki, and Y. Nakamura, pH-responsive liquid marbles stabilized with poly(2-vinylpyridine) particles. Soft Matter 6(3), 635–640 (2010). doi: 10.1039/b914997j.
35. E. Bormashenko, Y. Bormashenko, R. Grynyov, H. Aharoni, G. Whyman, and B. P. Binks, Self-propulsion of liquid marbles: Leidenfrost-like levitation driven by Marangoni flow. J. Phys. Chem. C 119(18), 9910–9915 (2015). doi: 10.1021/acs.jpcc.5b01307.
36. E. Bormashenko, Liquid marbles, elastic nonstick droplets: From minireactors to self-propulsion. Langmuir 33(3), 663–669 (2017). doi: 10.1021/acs.langmuir.6b03231.


37. P. S. Bhosale and M. V. Panchagnula, Sweating liquid micro-marbles: Dropwise condensation on hydrophobic nanoparticulate materials. Langmuir 28(42), 14860–14866 (2012). doi: 10.1021/la303133y.
38. E. Bormashenko and Y. Bormashenko, Non-stick droplet surgery with a superhydrophobic scalpel. Langmuir 27(7), 3266–3270 (2011). doi: 10.1021/la200258u.
39. A. Adamatzky, Binary full adder, made of fusion gates, in a subexcitable Belousov–Zhabotinsky system. Phys. Rev. E 92(3), 032811 (2015).
40. B. P. Belousov, A periodic reaction and its mechanism. Compil. Abstr. Radiation Med. 147(145), 1 (1959).
41. A. Zhabotinsky, Periodic processes of malonic acid oxidation in a liquid phase. Biofizika 9(306–311), 11 (1964).
42. L. Kuhnert, A new optical photochemical memory device in a light-sensitive chemical active medium. Nature 319(6052), 393 (1986).
43. L. Kuhnert, K. I. Agladze, and V. I. Krinsky, Image processing using light-sensitive chemical waves. Nature 337(6204), 244–247 (1989). doi: 10.1038/337244a0. URL http://www.nature.com/articles/337244a0.
44. Y. Igarashi and J. Gorecki, Chemical diodes built with controlled excitable media. IJUC 7(3), 141–158 (2011).
45. O. Steinbock, P. Kettunen, and K. Showalter, Chemical wave logic gates. J. Phys. Chem. 100(49), 18970–18975 (1996).
46. A. Adamatzky, B. de Lacy Costello, C. Melhuish, and N. Ratcliffe, Experimental implementation of mobile robot taxis with onboard Belousov–Zhabotinsky chemical medium. Mater. Sci. Eng. C 24(4), 541–548 (2004).
47. M.-A. Tsompanas, C. Fullarton, and A. Adamatzky, Belousov–Zhabotinsky liquid marbles in robot control. Sensors Actuators B: Chem. 295, 194–203 (2019).
48. J. Gorecki, K. Yoshikawa, and Y. Igarashi, On chemical reactors that can count. J. Phys. Chem. A 107(10), 1664–1669 (2003).
49. J. Gorecki and J. N. Gorecka, Information processing with chemical excitations–from instant machines to an artificial chemical brain. Int. J. Unconv. Comput. 2(4) (2006).
50. P. L. Gentili, V. Horvath, V. K. Vanag, and I. R. Epstein, Belousov–Zhabotinsky "chemical neuron" as a binary and fuzzy logic processor. IJUC 8(2), 177–192 (2012).
51. G. Gruenert, K. Gizynski, G. Escuela, B. Ibrahim, J. Gorecki, and P. Dittrich, Understanding networks of computing chemical droplet neurons based on information flow. Int. J. Neural Syst. 25(07), 1450032 (2015).
52. J. Stovold and S. O'Keefe, Associative memory in reaction-diffusion chemistry. In Advances in Unconventional Computing (Springer, 2017), pp. 141–166.
53. M.-Z. Sun and X. Zhao, Crossover structures for logical computations in excitable chemical medium. Int. J. Unconv. Comput. 11(2), 165–184 (2015).


54. W. M. Stevens, A. Adamatzky, I. Jahan, and B. de Lacy Costello, Time-dependent wave selection for information processing in excitable media. Phys. Rev. E 85(6), 066129 (2012).
55. J. Gorecki, J. Gorecka, and A. Adamatzky, Information coding with frequency of oscillations in Belousov–Zhabotinsky encapsulated disks. Phys. Rev. E 89(4), 042910 (2014).
56. J. Sielewiesiuk and J. Górecki, Passive barrier as a transformer of chemical signal frequency. J. Phys. Chem. A 106(16), 4068–4076 (2002).
57. J. Gorecka and J. Gorecki, T-shaped coincidence detector as a band filter of chemical signal frequency. Phys. Rev. E 67(6), 067203 (2003).
58. K. Gizynski and J. Gorecki, Chemical memory with states coded in light controlled oscillations of interacting Belousov–Zhabotinsky droplets. Phys. Chem. Chem. Phys. 19(9), 6519–6531 (2017).
59. V. Calayir and L. Pileggi, Fully-digital oscillatory associative memories enabled by non-volatile logic. In The 2013 International Joint Conference on Neural Networks (IJCNN) (IEEE, 2013), pp. 1–6.
60. J. Borresen and S. Lynch, Oscillatory threshold logic. PLoS One 7(11), e48498 (2012).
61. P. Baldi and R. Meir, Computing with arrays of coupled oscillators: An application to preattentive texture discrimination. Neural Comput. 2(4), 458–471 (1990).
62. D. E. Nikonov, G. Csaba, W. Porod, T. Shibata, D. Voils, D. Hammerstrom, I. A. Young, and G. I. Bourianoff, Coupled-oscillator associative memory array operation for pattern recognition. IEEE J. Exploratory Solid-State Comput. Dev. Circuits 1, 85–93 (2015).
63. G. Păun, Computing with membranes: An introduction. Bull. EATCS 67, 139–152 (1999).
64. G. Păun, Computing with membranes (P systems): A variant. Int. J. Found. Comput. Sci. 11(1), 167–181 (2000). doi: 10.1142/S0129054100000090.
65. G. Păun and G. Rozenberg, A guide to membrane computing. Theoret. Comput. Sci. 287(1), 73–100 (2002). doi: 10.1016/S0304-3975(02)00136-6.
66. G. Păun, Membrane computing. In Encyclopedia of Complexity and Systems Science (Springer, 2009), pp. 5523–5535. doi: 10.1007/978-0-387-30440-3_328.
67. V. K. Vanag and I. R. Epstein, Pattern formation in a tunable medium: The Belousov–Zhabotinsky reaction in an aerosol OT microemulsion. Phys. Rev. Lett. 87(22), 228301 (2001).
68. V. K. Vanag, Waves and patterns in reaction–diffusion systems. Belousov–Zhabotinsky reaction in water-in-oil microemulsions. Physics-Uspekhi 47(9), 923 (2004).
69. A. Kaminaga, V. K. Vanag, and I. R. Epstein, A reaction–diffusion memory device. Angewandte Chemie International Edition 45(19), 3087–3089 (2006).
70. J. Szymanski, J. Gorecki, and M. J. Hauser, Chemo-mechanical coupling in reactive droplets. J. Phys. Chem. C 117(25), 13080–13086 (2013).


71. J. Gorecki, K. Gizynski, J. Guzowski, J. Gorecka, P. Garstecki, G. Gruenert, and P. Dittrich, Chemical computing with reaction–diffusion processes. Philos. Trans. Roy. Soc. A: Math. Phys. Eng. Sci. 373(2046), 20140219 (2015).
72. A. Wang, J. Gold, N. Tompkins, M. Heymann, K. Harrington, and S. Fraden, Configurable NOR gate arrays from Belousov–Zhabotinsky micro-droplets. Eur. Phys. J. Spec. Topics 225(1), 211–227 (2016).
73. G. Gruenert, J. Szymanski, J. Holley, G. Escuela, A. Diem, B. Ibrahim, A. Adamatzky, J. Gorecki, and P. Dittrich, Multi-scale modelling of computers made from excitable chemical droplets. IJUC 9(3–4), 237–266 (2013).
74. A. Henson, J. M. P. Gutierrez, T. Hinkley, S. Tsuda, and L. Cronin, Towards heterotic computing with droplets in a fully automated droplet-maker platform. Philos. Trans. Roy. Soc. A: Math. Phys. Eng. Sci. 373(2046), 20140221 (2015).
75. R. J. Field and A. T. Winfree, Travelling waves of chemical activity in the Zaikin–Zhabotinskii–Winfree reagent. J. Chem. Edu. 56(11), 754 (1979).
76. O. Steinbock and S. C. Müller, Radius-dependent inhibition and activation of chemical oscillations in small droplets. J. Phys. Chem. A 102(32), 6485–6490 (1998). doi: 10.1021/jp981421u.
77. H. Kitahata, R. Aihara, N. Magome, and K. Yoshikawa, Convective and periodic motion driven by a chemical wave. J. Chem. Phys. 116(13), 5666–5672 (2002). doi: 10.1063/1.1456023.
78. H. Kitahata, N. Yoshinaga, K. H. Nagai, and Y. Sumino, Spontaneous motion of a Belousov–Zhabotinsky reaction droplet coupled with a spiral wave. Chem. Lett. 41(10), 1052–1054 (2012). doi: 10.1246/cl.2012.1052.
79. B. De Lacy Costello, I. Jahan, M. Ahearn, J. Holley, L. Bull, and A. Adamatzky, Initiation of waves in BZ encapsulated vesicles using light: Towards design of computing architectures. Int. J. Unconv. Comput. 9(3–4), 311–326 (2013).
80. A. Adamatzky, C. Fullarton, N. Phillips, B. De Lacy Costello, and T. C. Draper, Thermal switch of oscillation frequency in Belousov–Zhabotinsky liquid marbles. Roy. Soc. Open Sci. 6(4), 190078 (2019).
81. V. Gáspár, G. Bazsa, and M. Beck, The influence of visible light on the Belousov–Zhabotinskii oscillating reactions applying different catalysts. Zeitschrift für Physikalische Chemie 264(1), 43–48 (1983).
82. I. Hanazaki, Y. Mori, T. Sekiguchi, and G. Rábai, Photo-response of chemical oscillators. Physica D: Nonlinear Phenomena 84(1–2), 228–237 (1995).
83. N. Rambidi, T.-O. Kuular, and E. Makhaeva, Information-processing capabilities of chemical reaction–diffusion systems. 1. Belousov–Zhabotinsky media in hydrogel matrices and on solid supports. Adv. Mat. Opt. Elect. 8(4), 163–171 (1998).
84. R. Tóth, V. Gáspár, A. Belmonte, M. C. O'Connell, A. Taylor, and S. K. Scott, Wave initiation in the ferroin-catalysed Belousov–Zhabotinsky reaction with visible light. Phys. Chem. Chem. Phys. 2(3), 413–416 (2000).


85. Y. Wang, Y. Xie, C. Yuan, H. Wang, and D. Fu, Intelligent image sensor based on probing the evolution of redox potentials distributed in reaction–diffusion medium. Sensors Actuators B: Chem. 145(1), 285–292 (2010).
86. M. J. Blandamer and S. H. Morris, Investigation into the effect of temperature and added t-butyl alcohol on the dynamic properties of the Belousov reaction. J. Chem. Soc., Faraday Trans. 1: Phys. Chem. Condensed Phases 71, 2319–2330 (1975).
87. T. Vajda, A. Rockenbauer, and M. Győr, Cryo-oscillations. Belousov–Zhabotinskii (BZ) oscillations in frozen and undercooled solution. Int. J. Chem. Kinetics 20(8), 661–665 (1988).
88. L. Szirovicza, I. Nagypál, and I. Bárdi, Propagating reaction front in 'frozen' phase. Int. J. Chem. Kinetics 23(1), 99–101 (1991).
89. M. Masia, N. Marchettini, V. Zambrano, and M. Rustici, Effect of temperature in a closed unstirred Belousov–Zhabotinsky system. Chem. Phys. Lett. 341(3–4), 285–291 (2001).
90. T. Bánsági Jr, M. Leda, M. Toiya, A. M. Zhabotinsky, and I. R. Epstein, High-frequency oscillations in the Belousov–Zhabotinsky reaction. J. Phys. Chem. A 113(19), 5644–5648 (2009).
91. Y. Ito, M. Nogawa, and R. Yoshida, Temperature control of the Belousov–Zhabotinsky reaction using a thermoresponsive polymer. Langmuir 19(23), 9577–9579 (2003).
92. Pololu Corporation, Pololu Zumo Shield for Arduino User's Guide (2016; accessed July 16, 2018). https://www.pololu.com/docs/pdf/0J57/zumo_shield_for_arduino.pdf.
93. M.-A. Tsompanas, C. Fullarton, and A. Adamatzky, Videos of a robot controlled by Belousov–Zhabotinsky liquid marbles. Supplementary materials to the paper "Belousov–Zhabotinsky liquid marbles in robot control" (2018).
94. H. Yokoi, A. Adamatzky, B. de Lacy Costello, and C. Melhuish, Excitable chemical medium controller for a robotic hand: Closed-loop experiments. Int. J. Bifurc. Chaos 14(09), 3347–3354 (2004).
95. S. Stepney, The neglected pillar of material computation. Physica D: Nonlinear Phenomena 237, 1157–1164 (2008). doi: 10.1016/j.physd.2008.01.028.
96. R. Mayne, T. C. Draper, N. Phillips, J. G. H. Whiting, R. Weerasekera, C. Fullarton, B. P. J. de Lacy Costello, and A. Adamatzky, Neuromorphic liquid marbles with aqueous carbon nanotube cores. Langmuir 35(40), 13182–13188 (2019). doi: 10.1021/acs.langmuir.9b02552.
97. B. Sengupta and M. B. Stemmler, Power consumption during neuronal computation. Proc. IEEE 102(5), 738–750 (2014). doi: 10.1109/JPROC.2014.2307755.
98. A. Citri and R. Malenka, Synaptic plasticity: Multiple forms, functions, and mechanisms. Neuropsychopharmacology 33, 18–41 (2008). doi: 10.1038/sj.npp.1301559.


99. J. Cui, R. Sordan, M. Burghard, and K. Kern, Carbon nanotube memory devices of high charge storage stability. Appl. Phys. Lett. 81(17), 3260–3262 (2002).
100. S. Heinze, J. Tersoff, R. Martel, V. Derycke, J. Appenzeller, and P. Avouris, Carbon nanotubes as Schottky barrier transistors. Phys. Rev. Lett. 89(10), 106801 (2002).
101. P. Avouris, Molecular electronics with carbon nanotubes. Acc. Chem. Res. 35(12), 1026–1034 (2002).
102. M. R. Diehl, D. W. Steuerman, H.-R. Tseng, S. A. Vignon, A. Star, P. C. Celestre, J. F. Stoddart, and J. R. Heath, Single-walled carbon nanotube based molecular switch tunnel junctions. ChemPhysChem 4(12), 1335–1339 (2003).
103. K. Kim, C.-L. Chen, Q. Truong, A. M. Shen, and Y. Chen, A carbon nanotube synapse with dynamic logic and learning. Adv. Mater. 25(12), 1693–1698 (2013).
104. S. Kim, B. Choi, M. Lim, J. Yoon, J. Lee, H.-D. Kim, and S.-J. Choi, Pattern recognition using carbon nanotube synaptic transistors with an adjustable weight update protocol. ACS Nano 11(3), 2814–2822 (2017).
105. X. Sun, T. Chen, Z. Yang, and H. Peng, The alignment of carbon nanotubes: An effective route to extend their excellent properties to macroscopic scale. Acc. Chem. Res. 46(2), 539–549 (2012). doi: 10.1021/ar300221r.
106. G. Cellot, E. Cilia, S. Cipollone, V. Rancic, A. Sucapano, S. Giordani, L. Gambazzi, H. Markram, M. Grandolfo, D. Scaini et al., Carbon nanotubes might improve neuronal performance by favouring electrical shortcuts. Nat. Nanotech. 4(2), 126 (2009).
107. P. Poncharal, Z. Wang, D. Ugarte, and W. A. De Heer, Electrostatic deflections and electromechanical resonances of carbon nanotubes. Science 283(5407), 1513–1516 (1999).
108. I. O. Maciel, N. Anderson, M. A. Pimenta, A. Hartschuh, H. Qian, M. Terrones, H. Terrones, J. Campos-Delgado, A. M. Rao, L. Novotny et al., Electron and phonon renormalization near charged defects in carbon nanotubes. Nat. Mater. 7(11), 878 (2008).
109. R. Matsunaga, K. Matsuda, and Y. Kanemitsu, Observation of charged excitons in hole-doped carbon nanotubes using photoluminescence and absorption spectroscopy. Phys. Rev. Lett. 106(3), 037404 (2011).
110. Z. Chen, D. Zang, L. Zhao, M. Qu, X. Li, X. Li, L. Li, and X. Geng, Liquid marble coalescence and triggered microreaction driven by acoustic levitation. Langmuir 33(25), 6232–6239 (2017). doi: 10.1021/acs.langmuir.7b00347.
111. G. Katsikis, J. S. Cybulski, and M. Prakash, Synchronous universal droplet logic and control. Nat. Phys. 11(7), 588–596 (2015). doi: 10.1038/nphys3341.
112. Y. Zhao, J. Fang, H. Wang, X. Wang, and T. Lin, Magnetic liquid marbles: Manipulation of liquid droplets using highly hydrophobic Fe3O4 nanoparticles. Adv. Mater. 22(6), 707–710 (2010). doi: 10.1002/adma.200902512.


113. E. Gale, Neuromorphic computation with spiking memristors: Habituation, experimental instantiation of logic gates and a novel sequence-sensitive perceptron model. Faraday Discussions 213, 521–551 (2019). doi: 10.1039/C8FD00111A.
114. N. Diederich, T. Bartsch, H. Kohlstedt, and M. Ziegler, A memristive plasticity model of voltage-based STDP suitable for recurrent bidirectional neural networks in the hippocampus. Sci. Rep. 8, 9367 (2018). doi: 10.1038/s41598-018-27616-6.



© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0014

Chapter 14

Towards Colloidal Droplet Processors

Alessandro Chiolerio*,†,‡ and Andrew Adamatzky†

*Center for Sustainable Future Technologies, Istituto Italiano di Tecnologia, Via Livorno 30, 10144 Torino, Italy
†Unconventional Computing Lab, University of the West of England, Bristol, UK
‡[email protected]

Starting from the current limitations to the development of conventional von Neumann architectures, mainly due to high thermal density, issues regarding device integration, and energy demand, we introduce a colloidal droplet processor and envisage original in materio implementation schemes to create the processor's prototypes. Said technology offers unique features in terms of energetic sustainability, particularly important when dealing with big data, and can be seen as computing in a droplet. Existing biological analogues are also highlighted.

14.1. Background

Currently, commercial computers perform Boolean logic operations thanks to the massive development of silicon technology, in particular complementary metal–oxide–semiconductor (CMOS) technology. We are experiencing a sub-exponential development of the figures of merit that, for half a century, the well-known Moore's law (predicting the periodic doubling of the number of transistors placed on a single die) set out for the definition of industrial development and applied-research roadmaps.1 Other technologies are rising, such as quantum computation, promising ground-breaking capabilities per se, but so far without a proper impact on the consumer market. Currently a quantum computer requires complex means to control


the temperature in the mK range, with clear consequences for the energy requirements (partly mitigated if one compares its energy efficiency with that of classical processors) and for portability, which is far from being achieved. The Internet of Things (IoT), speech and face recognition, autonomous driving, personal connectivity, etc. produce a tremendous stream of data to be processed by Artificial Intelligence (AI)-enabled systems. We can therefore fully justify our effort in proposing sustainable computation in a droplet. Looking with greater attention at the diverse aspects of computational systems, we can divide the issues one has to cope with into three main domains: data throughput, heat dissipation, and energy consumption (Figure 14.1).

14.1.1. Data throughput

It has been calculated that data flux is approximately invariant with respect to the communication scale: interestingly, regardless of the typical distance of a data link, we nowadays move approximately 100 TB/s. This number also features a compound annual growth rate (CAGR)

Figure 14.1. Current challenges in von Neumann computing: energy, big data handling and heat dissipation, sketched in an artistic vision.


of about 20%.1 Fiber optics and photon-mediated communication channels will necessarily become more important, particularly for the reduction of transfer losses.

14.1.2. Heat dissipation

The heat released by charging and discharging the parasitic capacitances of transistors requires powerful active cooling technologies and consequently impacts energy efficiency. Central Processing Units (CPUs) and Graphical Processing Units (GPUs) have reached particularly high thermal energy densities, approaching that of the Sun's surface (10 kW cm^−2).2 An interesting solution for passive cooling is the use of enhanced-heat-capacity liquids such as ferrofluids (FFs).3 Spintronics also promises important improvements: by working on the spin of the carriers rather than on their charge, we could avoid switching devices and could flip spins with faster processes that do not generate as much heat.4

14.1.3. Energy consumption

Let us start by discussing the relevance of energetic expenses for computational purposes. A quantitative analysis would need to take into account current sources of consumption: personal computers, smartphones, cloud data storage, network data distribution, workstations, mainframes, and supercomputers, including secret organizations and military usage. It is our belief that the easiest approach to such an estimate is to connect installed computational power capabilities to the global gross product.(a) At the end of 2015, global computation capability was in the range 2 × 10^20 to 1.5 × 10^21 Floating Point Operations Per Second (FLOPS). Considering a CAGR of around 20%, we end up with approximately 5 × 10^20 to 3.7 × 10^21 FLOPS as of 2020. At the time of writing, the most powerful non-distributed computing system is the Fugaku supercomputer, property of the Riken Center for Computational Science. It features a maximum computational

(a) aiimpacts.org/global-computing-capacity/.


capability of 537 PFLOPS in double-precision burst mode (a capability that surpasses the Exascale in integer precision, achieving an impressive 4.30 EFLOPS in burst mode) and a peak power consumption of 40 MW, so that we can estimate each double-precision operation to require approximately 7.45 × 10^−11 J, that is, about 75 pJ.(b) This result is far from efficient if compared to the commercial devices powering personal computers (in the range of 100 aJ) or to current estimates for biological computation (10 fJ/spike). One can imagine to what extent computing energy issues impact human society and the environment, including the extraction, displacement and valorization of energetic resources, the exploitation of renewables, the emission of pollutants, and so on. At the time of writing it was not possible to evaluate the energy efficiency of quantum computing systems in usage.

14.2. Liquid Droplet Computer

We now have to introduce a concept that could mislead the reader: when talking about liquids, one immediately imagines that we are talking about the liquid physical aggregation state, about a pure substance, a colloidal suspension, or possibly a solution. In computer science, liquid refers to particular properties a system, whether software or hardware, has towards computation. For example, a Liquid Droplet Computer (LDC) is a computational model often used for modelling biological circuits (see Figure 14.2).5 An LDC projects a low-dimensional spatiotemporal input space onto a high-dimensional configurational space, represented by the degrees of freedom of the LDC itself. This high number of thermodynamic configurations forms a space that should be conservative (meaning that it preserves time symmetry) and separative (meaning that it responds to different stimuli in a measurably different way).6 LDCs are a special case of reservoir computing (RC) models, belonging to artificial neural networks (ANNs).7 Two examples of natural LDCs are colonies of ants and the immune system, where single agents have unlimited or limited freedom to move, respectively, and yet they are capable of a

(b) www.r-ccs.riken.jp/en/fugaku/project/outline.
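The energy-per-operation estimate quoted above is simply peak power divided by peak throughput. A minimal snippet reproducing the figures used in the text (our illustration, with the reference points quoted above hard-coded as constants) is:

```cpp
#include <cstdio>

// Back-of-envelope energy per operation: peak power / peak throughput.
int main() {
    const double fugakuWatts = 40e6;    // ~40 MW peak power consumption
    const double fugakuFlops = 537e15;  // 537 PFLOPS, double precision
    const double joulesPerOp = fugakuWatts / fugakuFlops;  // ~7.45e-11 J
    std::printf("Fugaku: ~%.1f pJ per double-precision operation\n",
                joulesPerOp * 1e12);
    // Reference points quoted in the text:
    std::printf("vs. ~100 aJ (commercial devices) and ~10 fJ per spike "
                "(biological computation)\n");
}
```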


Figure 14.2. In materio computing through a lysosome, using a Random Optical Machine. On the left, a sensitive screen representing a Charge-Coupled Device; in the middle, the lysosome; on the right, two plane waves propagating from a Spatial Light Modulator; sketched in an artistic vision.

collective behavior that cannot be predicted by studying the features of individual components of the network.8 LDCs are extremely stable; for example, the interference pattern of waves produced on a freely oscillating surface of water was used to perform the XOR Boolean operation for the specific task of speech recognition.9

14.2.1. Holonomic processors

Optimization of ANNs, the so-called "learning", is an imperative and fundamental, yet often disregarded, part of the computational processes associated with AI. It represents perhaps the most important, energy-demanding and time-consuming part of computation. Neuromorphic computing comes to reduce this impact, and quite recently optical or photonic computing has been demonstrated to be an extremely powerful tool. Here the idea is to use electromagnetic waves, like Wi-Fi fields, polaritons, lasers, Ising machines, and so on. Among the results that have been achieved, it is worth mentioning the use of a hydrogel medium to promote tunable interaction between non-overlapping


laser beams.10 This is a foundational brick in the wall of new in materio computational platforms implemented using non-segregated soft matrices, such as hydrogels or colloids, as nonlinear media. We prefer the name "holonomic", recalling the works of K. Pribram, who first hypothesized and subsequently demonstrated that the human brain operates like a holographic machine in a continuous dynamic resonance between excitatory and depressing waves.11 Massive parallelism could be enabled by interference between optical fields diffracting on the same nonlinear medium, an operation that brings us back to optimization and learning in ANNs. The problem of using nonlinear waves as a computing reservoir was solved theoretically by Marcucci et al.12 and implemented on living tumour spheroids to demonstrate a random optical learning machine by Pierangeli et al.13 The physical capability of a single synthetic liquid cell has already been calculated to be on the order of 10^12 coherence domains per liter; these elements can interact simultaneously when subjected to an optical field, bringing us to the interesting number of 10^21 operations per second with an astounding energy consumption on the order of 1 W.1 Why liquid? Because a liquid is incompressible, at least in the engineering range of pressures we can achieve with standard operations, and can adapt easily to accidental changes of shape without losing functionality: it is fault-tolerant. Furthermore, the amorphous nature of its constituents is such that severe conditions that could bring conventional systems to breakdown, and would require expensive redundancy or fault-tolerant architectures, should not impair its capabilities: for example, electric fields, magnetic fields, ionizing radiation, and so on. Of course this tremendous computational power still requires learning and standard silicon-based devices; we want to be clear about this. But once trained and optimized, the system could provide capabilities that are nowadays out of reach. In the next section, we envisage future architectures.

14.3. Colloidal Processors

We would like to provide a spectrum of possible architectures, setting aside any specific application domain so as not to limit our imagination. In fact,


architectures are enabling tools that could be used to address a specific problem, and specific problems are the outcomes of the age in which we live, and feature high volatility!

14.3.1. Phase change architectures

Computing in a droplet is per se an established tool. But what if we let the computing matter change aggregation state? In a first hypothesis, we imagine letting the temperature rise in a controlled way and creating a partial pressure of computing-matter vapour stabilized in dynamic equilibrium with the liquid. Gaseous species have diffusion and mixing properties different from liquid ones, so that doping could be implemented quite quickly, with the interesting feature of acting homogeneously through the droplet surface, rather than locally via diffusion from an injection point. In another hypothesis, we let the temperature fall in a controlled way and create nuclei of solid computation matter dispersed in the droplet. Depending on the diffusion of the species composing the computational matter, we might expel some molecules from the forming crystals and therefore increase their concentration in the remaining liquid, or vice versa, if solubility is higher in the solid than in the liquid. Crystalline nuclei will provide another source of diffraction for incoming waves (regardless of their nature: acoustic, optical or electromagnetic). It is even possible to create controlled conditions for the coexistence of liquid, solid and vapour phases in equilibrium, therefore enabling us to play with doping, processing and diffracting information (Figure 14.3). Colloid processors can be formally represented by Voronoi diagrams. We can abstract colloids as planar Voronoi diagrams whose points are colloidal particles (Figure 14.4). The efficiency of the Voronoi diagram representation is proven in arrangements of discs and sphere packing,14–16 structural analysis of liquids and gases,17 protein structure,18 and inter-atomic bonds.19 Let P be a nonempty finite set of planar points. A planar Voronoi diagram31 of the set P is a partition of the plane into regions such that, for any element of P, the region corresponding to a unique point p contains all those points of the plane which are closer to p than


Figure 14.3. Envisaged architectures for in materio computing, sketched in an artistic vision. From top to bottom and left to right: a phase-change system composed of an evaporating droplet; a phase-change system composed of a crystallizing droplet; a microfluidic system connecting two computing droplets; a granular system supporting the propagation of a generic wave; a foam system supporting density-gradient waves; the geomagnetic field, with magnetic induction lines highlighted (green) and the van Allen belts hosting high-energy ions (intercepted by surfaces, color code proportional to particle density).


Figure 14.4. Example of a Voronoi tessellation with second-order neighborhood structures highlighted for four cells. In each case the central cell p is filled with blue (black in gray-scale reproduction) color, its first-order neighbors w(p) are filled with red (dark-gray) color, and the neighbors of the cells of w(p) are filled with green (light-gray) color. For each blue (black) Voronoi cell the second-order neighborhood is the set of red (dark-gray) and green (light-gray) Voronoi cells.

Let P be a nonempty finite set of planar points. A planar Voronoi diagram31 of the set P is a partition of the plane into regions such that, for each point p of P, the region corresponding to p contains all those points of the plane which are closer to p than to any other point of P. The unique region vor(p) = {z ∈ R² : d(z, p) < d(z, m) ∀ m ∈ P, m ≠ p} assigned to the point p is called the Voronoi cell of p.32 The boundary ∂vor(p) of the Voronoi cell of a point p is built of segments of the bisectors separating pairs of geographically closest points of the given planar set P. The union VD(P) = ∪_{p∈P} ∂vor(p) of all boundaries of the Voronoi cells determines the planar Voronoi diagram. A Voronoi automaton is a tuple V = ⟨V(P), Q, N, u, f⟩, where P is a finite planar set, V(P) = {V(p) : p ∈ P}, Q is a finite set of cell states, N is the set of natural numbers (discrete time), u : V(P) → V(P)^k is a second-order neighborhood, 0 < k < |P|, and f : Q^k → Q is a cell-state transition function. The cell-state set has four elements, Q = {◦, +, −, #}. Thus we assign three excitation-related states — resting (◦), excited (+), and refractory (−) — to cells, and one precipitate state, #.
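As an illustration, the first- and second-order neighborhoods defined above can be computed directly from a planar point set. The following sketch is ours (not from Ref. [20]); it assumes SciPy, whose `scipy.spatial.Voronoi` exposes a `ridge_points` array listing pairs of input points whose cells share an edge, and all function names are illustrative.

```python
# Sketch: first-order w(p) and second-order u(p) neighborhoods of a
# planar Voronoi diagram. Illustrative code, not the authors' own.
import numpy as np
from scipy.spatial import Voronoi

def voronoi_neighborhoods(points):
    """Map each point index to the indices of its first-order (w)
    and second-order (u) Voronoi neighbors."""
    vor = Voronoi(points)
    w = {i: set() for i in range(len(points))}
    for i, j in vor.ridge_points:      # cells i and j share an edge
        w[i].add(j)
        w[j].add(i)
    # u(p): first-order neighbors together with their neighbors,
    # excluding p itself (the red and green cells of Figure 14.4).
    u = {i: (w[i] | {r for q in w[i] for r in w[q]}) - {i} for i in w}
    return w, u

points = np.random.default_rng(1).uniform(0, 100, size=(50, 2))
w, u = voronoi_neighborhoods(points)
```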

Cells update their states in discrete time. The state of a cell V(x) at time step t ∈ N is denoted V(x)^t. All cells update their states in parallel using the same cell-state transition function. Let w(p), p ∈ P, be the first-order neighborhood of a Voronoi cell V(p) from V(P): w(p) = {V(q) ∈ V(P) : ∂V(p) ∩ ∂V(q) ≠ ∅}, that is, the set of Voronoi cells which share an edge with V(p). The second-order neighborhood u(p) is the set of the neighbors of the first-order neighbors of V(p): u(p) = ∪_{V(q) ∈ w(p)} w(q). Examples of Voronoi cell neighborhoods are shown in Figure 14.4. The transition from the excited to the refractory state is unconditional, that is, it takes place irrespective of the states of a cell's neighbors. A resting cell excites if it has at least one excited neighbor. The refractory and precipitate states are absorbing: once a cell takes either of these two states, it does not update its state any longer. Neighborhood sizes may differ between Voronoi cells; therefore we use a state transition function in which a cell updates its state depending on the relative excitation in its neighborhood. We assume that precipitation occurs in a resting cell when the ratio of excited neighbors to the total number of neighbors ν(x) = |u(x)| exceeds some threshold η ∈ [0, 1]. Let σ^t(x) be the number of excited cells in the cell V(x)'s second-order neighborhood, σ^t(x) = |{V(y) ∈ u(x) : V(y)^t = +}|; then a cell updates its state by the following rule:

\[
V(x)^{t+1} =
\begin{cases}
\#, & \text{if } V(x)^t = \circ \text{ and } \sigma^t(x)/\nu(x) > \eta,\\
+, & \text{if } V(x)^t = \circ \text{ and } 0 < \sigma^t(x)/\nu(x) \leq \eta,\\
-, & \text{if } V(x)^t = +,\\
\circ, & \text{otherwise.}
\end{cases}
\]

While a detailed analysis of Voronoi automata is provided in Ref. [20], here we give an example of approximating a Voronoi diagram on a Voronoi automaton. Let B be a set of planar points whose Voronoi diagram V(B) is approximated on V, with η = 0.4. We project B onto V and excite the cells of V which are closer than 9 units to points of B. Excitation waves spread on V.
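The transition rule translates directly into code. The sketch below is our illustration (names are hypothetical): it updates all cells synchronously over a second-order neighborhood map `u`, such as the one computed in the previous snippet. Waves necessarily die out, because the excited state always decays into the absorbing refractory state.

```python
# Sketch of the Voronoi automaton update rule; states: 'o' resting,
# '+' excited, '-' refractory, '#' precipitate. Illustrative code.
def step(states, u, eta=0.4):
    nxt = {}
    for x, nbrs in u.items():
        s = states[x]
        if s == '+':
            nxt[x] = '-'              # excited -> refractory, unconditional
        elif s in ('-', '#'):
            nxt[x] = s                # absorbing states never change
        else:                         # resting cell
            excited = sum(states[y] == '+' for y in nbrs)
            if excited / len(nbrs) > eta:
                nxt[x] = '#'          # precipitation threshold exceeded
            elif excited > 0:
                nxt[x] = '+'          # at least one excited neighbor
            else:
                nxt[x] = 'o'
    return nxt

def run(states, u, eta=0.4):
    # Iterate to extinction; cells in '#' then trace the edges of the
    # approximated Voronoi diagram of the stimulation sites.
    while any(s == '+' for s in states.values()):
        states = step(states, u, eta)
    return states
```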

(Figure 14.5 panels: (a) t = 1; (b) t = 12; (c) t = 20; (d) t = 44; (e) scheme of the computed diagram.)

Figure 14.5. Approximation of the Voronoi diagram V(B) on a Voronoi automaton V, η = 0.4. Points of B are projected onto V at time step t = 0. Seven Voronoi cells of V are excited and generate quasi-circular excitation waves. The configuration of V at time step t = 1, when the waves have just started to develop, is shown in (a). The waves propagate outward from their initial stimulation sites and convert the Voronoi cells they occupy into refractory states (b)–(d). When two or more waves collide, precipitation occurs. By the 44th step of the automaton's development the excitation dies out, but the domains of precipitate cells represent the edges of the approximated Voronoi diagram V(B). Resting cells are blank, excited cells red (dark-gray), refractory cells gray, and precipitate cells black. (e) Schematic representation of the data objects and their Voronoi diagram, or skeleton, for precipitation-based approximation.

The excitation waves collide, and precipitation occurs near the sites of the waves' collisions. The configuration of cells in the precipitate state represents the edges of V(B). An example of the excitation–precipitation dynamics in V approximating V(B), where B is a planar set of seven points, is shown in Figure 14.5. A scheme of the computed Voronoi diagram is shown in Figure 14.5(e).

14.3.2. Microfluidic circuits

Computing droplets could interact by several means. Besides the obvious integration into a computing scheme where each droplet performs over a set of trained input–output gates, which can interact by standard electronic means, we can imagine putting different droplets in chemical communication by pumping a small quantity of computing matter into a microfluidic channel connecting, say, computing droplet A with computing droplet B. This operation is significant for those cases where interaction with acoustic, optical, or electromagnetic fields produces a (reversible) change in some properties of the computing matter. It means we can drive one medium out of equilibrium without exposing that specific medium to the input that creates the response. Such a solution, implemented on a random array of channels, was shown to enable pressure sensing and computing capabilities.21

14.3.3. Layered shells

The shape of the computing element is important to define the connectivity degree, acting on hidden variables and ultimately influencing the application that maps low-dimensional input/output spaces onto high-dimensional inner spaces. A pseudospherical droplet has a certain degree of connection, much different from a system composed of a two-dimensional array of droplets connected along horizontal and vertical lines by, say, laser beams, and much different again from a sandwich of layered shell elements of any curvature. The distance between such shells can ultimately affect the divergence of the light diffracting among layers, and therefore tune the number of agents influenced by a specific single agent pertaining to the previous layer. A thin liquid shell can implement what is known as morphological computation,22 like spider webs.23 The curvature of liquid shells could be modified by active mechanical systems, or by passive ones, as a function of the external gravitational field, speed, acceleration, etc.

14.3.4. Granular media, foams, and plasmas

Without the intention of pushing acoustic applications at the expense of optical ones, we also consider that granular media can be used for a similar purpose.24 We have seen how randomly distributed spherical glass particles can be used to induce random lasing thanks to Lévy flights, so that granular media offer interesting properties when integrated in a computing device. The void space among the solid particles can be infiltrated with a solid material of a different nature, or by a liquid, or by a gaseous substance. Rocks and ground, infiltrated by air in their topmost layers, by water in their fluidized layers, and eventually by crude oil or natural gas in their deeper layers, can also be seen as a computational medium, subjected to the propagation and diffraction of seismic waves, in the huge natural computer called planet Earth. Similarly, considering foams as a boundary layer of liquid or solid nature that encases gases, creating compartments, we also propose the use of such scatterers, which can be easily stabilized by using proper surfactants. Even electric fields can be used to stabilize nanobubbles of air in water, so that a substance that is apparently liquid is better described as a foam.25 Our causal interpretation of the facts is that a rough sea, blown by winds, is the natural consequence of a physical force that mixes seawater with air and creates foam, waves, currents, turbulence, saltiness transported by the same winds, and floating sand. An equivalent vision is that the natural environment acts as an immense processor, and any change we might notice is simply due to the processing of information carried by incoming waves and brought away to the horizon by outgoing waves. A similar medium that could be used to perform computations is plasma, easily formed both in vacuum conditions and at atmospheric pressure by proper electric fields. Plasma can interact with incoming acoustic, optical, or electromagnetic waves and diffract them, providing capabilities similar to those we have already studied. Evaporation and condensation might not be possible; nevertheless, a plasma could be doped by introducing specific gaseous species, or could be
transported by means of microwaves (MW), slowly varying magnetic fields, or slowly varying electric fields. Also in the case of plasma we might trace a natural parallel, offered by the plasma bubbles that protect our planet from outer radiation (the ionosphere and the van Allen belts), subject to solar wind perturbations.26 The electric pressure of the solar wind, pushing on our magnetosphere, is able to change the shape of the van Allen belts, their extension, their local curvature, and eventually even their number! The auroras therefore acquire a new meaning, that of natural states of our magnetospheric computing medium.

14.4. Biological Processors

Making reference to Figure 14.2, we asked ourselves whether this particular architecture has been developed by natural evolution in living organisms. The analogy with the eye is quite evident: incoming electromagnetic waves are collected, focused, and sent to the inner chamber containing the vitreous humor, which is the equivalent of the colloidal processor; the humor itself provides the mechanical pressure that keeps the retina in place, the retina being equivalent to the CCD device depicted there and providing a first layer of computation (Figure 14.6(a)). The humor's composition includes mostly phagocytes and hyalocytes, 98–99% water as solvent, salts, sugars, vitrosin, a network of collagen type II fibrils with glycosaminoglycan, hyaluronan, opticin, and a wide array of proteins.28 Interestingly, the vitreous fluid is not present at birth, the eye being filled with only the gel-like vitreous body until the age of 4 or 5. At that age the representation of the outer world in standard subjects is complete, and such representation is always primarily based, for non-blind individuals, on visual imaging, visual pattern recognition and the representation of volumes, movements, geometrical features, and so on. We might therefore have highlighted a fundamental role of the vitreous humor, beyond the well-known aspects of pressure balance and refractive index matching. Are there other structures that could have a similar role? Without claiming to be exhaustive, we recall the melon and junk found in cetaceans (Figure 14.6(b)).


Figure 14.6. (a) Sketch showing the inner structure of a human eye: 1) aqueous humor, 2) iris, 3) lens, 4) collagen strands, 5) glycosaminoglycans, 6) hyaluronan components, 7) Cloquet’s canal, 8) retina, 9) optic nerve. (b–g) Artistic representation of three-dimensional reconstructions of melon morphology from different odontocete species. Depicted structures represent the skin (gray tones), the skull (white), and the melon, bursae complexes, spermaceti organ and fat bodies (orange). The sketch was inspired by Ref. [27]. b) Kogia breviceps; c) Ziphius cavirostris; d) Pontoporia blainvillei; e) Phocœna phocœna; f) Tursiops truncatus; g) Lagenorhynchus obliquidens.

These are large volumes, localized in the nose of these mammals, filled with organic compounds like triglycerides and wax esters, and responsible for focusing and modulating the sound waves used to perform echolocation.27 Similarly, the cushion pads of elephants' feet are used to communicate through seismic waves; they are selectively activated by the "freezing behavior" of the animal, which leans forward compressing the front cushions, while higher-frequency waves detected by the ears are filtered out by a unique sphincter-like muscle that
reduces the diameter of the ear duct. Further studies should be carried out, implementing this particular interpretation scheme, to elucidate whether such a role can be confirmed by experiments. Another example of a biological colloidal processor with a phase-changing architecture (see Section 14.3.1) is the cytoskeleton network of actin filaments and tubulin microtubules. We focus here on actin only. Actin is a crucial protein of all eukaryotic cells. It is present in the forms of monomeric, globular actin (G-actin) and filamentous actin (F-actin). A volume with G-actin can be seen as a "pure" form of colloid. Units of G-actin can undergo electromagnetic and mechanical excitation, which in turn can be transmitted through the cytoplasm between the G-actin units. The communication in such a case is one-to-many, possibly determined by a neighborhood radius beyond which electromagnetic or mechanical vibration waves cease to propagate. Under the appropriate conditions, G-actin polymerizes into F-actin, forming a double-helical structure. The actin filaments form bundles. The bundles can conduct signals. The signals could be represented by travelling localizations: defects, ionic waves, solitons. The travelling localizations can implement computation. Actin bundle networks residing inside living cells are difficult to control, and it is therefore preferable to prototype synthetic phase-transition colloid processors made of actin bundle networks in vitro. This can be implemented as follows. In 1976, Deamer and Bangham produced 0.13 μm (large by all means) single-wall liposomes: an ether solution with dissolved lipids is injected into warm water.29 Their method was used by Cortese et al.30 to encapsulate actin inside the vesicles. They showed that actin inside the lipid vesicles undergoes polymerization from discrete sites of nucleation. Moreover, the actin changes the shape of the vesicles during its polymerization. Further work33 demonstrated actin polymerization, triggered by K⁺ or Mg²⁺ ions, in dimyristoylphosphatidylcholine vesicles, and the mechanical interaction of actin filaments with the membrane. These findings suggest potential designs of an actin-based colloid processor: recursively embedded vesicles (Figure 14.7(a)), which can be further arranged in 2D or 3D arrays to make massively parallel phase-transition colloid processors (Figure 14.7(b)).


Figure 14.7. Biological processor with a phase transition architecture. (a) Vesicles with actin bundle networks embedded in a larger vesicle. (b) Actin vesicle massively parallel processor.

14.5. Conclusions

Colloids possess a range of unique properties which make them suitable for implementing massively parallel computation. They are stable, that is, the particles remain suspended in the solution; thus, it is possible to address any single particle to access a memory cell of the colloid processor. The colloids are not affected by ordinary filters and therefore give us a robust computing matter. The colloids are heterogeneous: the dispersed phase can be used as the computing matter and the dispersion medium as a conductive or communication substrate. Colloids reflect light (Tyndall effect) and refract light, which offers an opportunity for optical computation with colloid droplets. The electrical properties of colloids allow for manipulation of the dispersed phase and, thus, a "re-wiring" of the computing architectures in the colloid droplet processors. In addition to optical interfaces, one can also establish an electromagnetic interface exploiting electrophoresis. Combinations of electrophoresis and electro-osmosis allow for precise re-configuring of the colloid-based computing architectures. We have discussed various architectures and potential implementations of colloid droplet processors, as well as possible biological colloidal processors. Our future studies will be concerned with experimental laboratory implementations of prototypes of colloid computing devices.

Acknowledgment This project has received funding from the European Union’s Horizon 2020 research and innovation programme FET OPEN “Challenging current thinking” under grant agreement No. 964388 — COgITOR.

References
1. A. Chiolerio, The fourth order cybernetics. Adv. Intell. Syst. (2020). ISSN 2640-4567.
2. B. Yellin, Saving the future of Moore's Law. Dell Technologies white paper, https://education.dellemc.com/content/dam/dell-emc/documents/en-us/2019KS Yellin-Saving The Future of Moores Law.pdf.
3. K. Jahani, M. Mohammadi, M. Shafii, and Z. Shiee, J. Electron. Packag. 135, 2 (2013).
4. A. Chiolerio, Spintronic Devices. PhD thesis, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy (2009).
5. X. Sun, X. Wang, B. Yuan, and J. Liu, Mater. Today Phys. 14, 100245 (2020).
6. W. Maass, Lecture Notes in Computer Science 4497, 507–516 (2007).
7. E. Hourdakis and P. Trahanias, Neurocomputing 107, 40 (2013).
8. M. L. Alomar Barceló, Methodologies for hardware implementation of reservoir computing systems. PhD thesis, Universitat de les Illes Balears (2017).
9. V. Canals, A. Morro, A. Oliver, M. L. Alomar, and J. L. Rossello, IEEE Trans. Neural Networks Learn. Syst. 27, 551 (2016).
10. J. Liu, L. Sheng, and Z.-Z. He, Liquid Metal Soft Machines, 1st edn. (Springer, 2019).
11. K. Pribram, Brain and Perception: Holonomy and Structure in Figural Processing, 1st edn. (Lawrence Erlbaum Associates Inc., 1991).
12. G. Marcucci, D. Pierangeli, and C. Conti, Theory of neuromorphic computing by waves: Machine learning by rogue waves, dispersive shocks, and solitons. Phys. Rev. Lett. 125, 093901 (2020).
13. D. Pierangeli, V. Palmieri, G. Marcucci, C. Moriconi, G. Perini, M. De Spirito, M. Papi, and C. Conti, Living optical random neural network with three dimensional tumor spheroids for cancer morphodynamics. Commun. Phys. 3, 160 (2020).
14. K. Lochmann, L. Oger, and D. Stoyan, Statistical analysis of random sphere packings with variable radius distribution. Solid State Sciences 8(12), 1397–1413 (2006).
15. G. Filatovs, Delaunay-subgraph statistics of disc packings. Mater. Charact. 40(1), 27–35 (1998).
16. V. Luchnikov, M. L. Gavrilova, N. N. Medvedev, and V. Voloshin, The Voronoi–Delaunay approach for the free volume analysis of a packing of balls in a cylindrical container. Future Gener. Comput. Syst. 18(5), 673–679 (2002).

17. A. V. Anikeenko, M. Alinchenko, V. Voloshin, N. N. Medvedev, M. L. Gavrilova, and P. Jedlovszky, Implementation of the Voronoi–Delaunay method for analysis of intermolecular voids. In International Conference on Computational Science and Its Applications (2004), pp. 217–226.
18. J. Bernauer, R. P. Bahadur, F. Rodier, J. Janin, and A. Poupon, DiMoVo: A Voronoi tessellation-based method for discriminating crystallographic and biological protein–protein interactions. Bioinformatics 24(5), 652–658 (2008).
19. L. W. Hobbs, Network topology in aperiodic networks. J. Non-Cryst. Solids 192, 79–91 (1995).
20. A. Adamatzky, B. D. L. Costello, J. Holley, J. Gorecki, and L. Bull, Vesicle computers: Approximating a Voronoi diagram using Voronoi automata. Chaos, Solitons & Fractals 44(7), 480–489 (2011).
21. A. Chiolerio and A. Adamatzky, Flexible Printed Electron. 5, 025006 (2020).
22. S. M. Hadi Sadati and T. Williams, Biomimetic and Biohybrid Systems, 1st edn. (Springer, 2018).
23. H. Elettro, S. Neukirch, F. Vollrath, and A. Antkowiak, Proc. Natl. Acad. Sci. 113, 6143 (2016).
24. H. Jaeger and S. R. Nagel, Granular solids, liquids and gases. Rev. Mod. Phys. 68(4), 1259–1273 (1996).
25. M. R. Ghaani, P. G. Kusalik, and N. J. English, Sci. Adv. 6, eaaz0094 (2020).
26. E. Adams, K. Fretz, A. Ukhorskiy, and N. Fox, Van Allen Probes mission overview and discoveries to date. Johns Hopkins APL Technical Digest 33(3), 173–182 (2016).
27. M. F. McKenna, T. W. Cranford, and N. D. Pyenson, Marine Mammal Sci. 28(4), 690–713 (2012).
28. J. Mains, L. Ean Tan, T. Zhang, L. Young, R. Shi, and C. Wilson, Invest. Ophthalmol. Visual Sci. 53(8), 4778–4786 (2012).
29. D. Deamer and A. Bangham, Large volume liposomes by an ether vaporization method. Biochimica et Biophysica Acta (BBA) - Biomembranes 443(3), 629–634 (1976).
30. J. D. Cortese, B. Schwab, C. Frieden, and E. L. Elson, Actin polymerization induces a shape change in actin-containing vesicles. Proc. Natl. Acad. Sci. 86(15), 5773–5777 (1989).
31. G. Voronoi, Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Premier mémoire. Sur quelques propriétés des formes quadratiques positives parfaites. Journal für die reine und angewandte Mathematik (Crelles Journal) 1908(133), 97–102 (1908).
32. F. P. Preparata and M. I. Shamos, Intersections. In Computational Geometry (Springer, 1985), pp. 266–322.
33. M. Bärmann, M. J. Käs, H. Kurzmeier, and E. Sackmann, A new cell model — Actin networks encaged by giant vesicles. In The Structure and Conformation of Amphiphilic Membranes (Springer, 1992), pp. 137–143.


© 2021 World Scientific Publishing Company. https://doi.org/10.1142/9789811235740_0015

Chapter 15

Biomolecular Motor-based Computing

Arif Md. Rashedul Kabir and Akira Kakugo∗

Faculty of Science, Department of Chemistry, Hokkaido University, Sapporo 060-0810, Japan
∗[email protected]

Biomolecular motors are the smallest natural machines, playing crucial roles in many physiological functions in living organisms. Due to their outstanding ability to convert chemical energy into mechanical work with remarkably high energy efficiency, biomolecular motors have been finding various applications in artificial environments. In this chapter, we introduce the utility of biomolecular motors in molecular computing as one of the prominent examples of unconventional computing. We discuss how the self-propelling ability of biomolecular motors serves as the key to their utilization in such computing. In particular, we focus on the DNA-mediated swarming of biomolecular motors, which has recently opened a new door to harnessing the emergent functions of molecular robots for molecular computation in a robust and parallel manner. All these recent developments are expected to further advance unconventional computing to a new paradigm.

15.1. Introduction

The biomolecular motor systems actin–myosin, microtubule–kinesin, and microtubule–dynein are natural machineries that keep living organisms dynamic.1 These natural machineries play vital roles in a number of cellular activities, such as cell motility, cell division, and intracellular transport, which are important for living organisms.1, 2 Biomolecular motors can convert chemical energy into mechanical work with remarkably high energy efficiency and specific
power compared to man-made machineries. Their small size, natural prevalence, and negligible cost of reconstruction have made them promising candidates for various applications such as nanoscale transport, sensing, sorting, surface imaging, force measurement, and so on.3, 4 Active self-assembly of biomolecular motors under in vitro conditions plays an important role in employing them in various applications.3, 5 In recent years, biomolecular motors have attracted much attention in molecular robotics.6 Advanced technology for the manipulation of molecules, and the fusion of supramolecular chemistry, nanotechnology, chemical engineering, biomolecular engineering, etc., have enabled great progress towards utilizing biomolecular motors in unconventional computing, particularly in parallel computing, which is important for solving problems of a combinatorial nature.7 Despite the utility of electronic computers in performing large numbers of operations at high speed, parallel operation for accomplishing combinatorial tasks is still a big challenge. On the other hand, although DNA computation, quantum computation, and microfluidics-based computation have been promising, their applications have been restricted by respective drawbacks. DNA-based computation, which provides mathematical solutions through the hybridization of DNA strands or the formation of DNA nanostructures, requires large amounts of DNA.8–10 Decoherence and the small number of quantum bits that can be integrated limit the application of quantum computation.11 Microfluidics-based parallel computation suffers from the physical size and complexity of the devices diverging with the size of the problem to be solved.12 Biomolecular motor systems may offer solutions to these drawbacks of conventional computing due to several advantages. Biomolecular motor systems are self-propelled: the protein filaments actin and microtubules are driven by their associated motors, that is, myosin and kinesin/dynein, respectively (Figure 15.1). Therefore, no external driving force is required to operate biomolecular motor-driven protein filaments. The rapid movement of motor-driven protein filaments can enhance computational speed and minimize the required time.


Figure 15.1. Schematic diagrams showing the propulsion of a microtubule by kinesin (a) or dynein (b), and of an actin filament by myosin (c); reproduced with permission from Ref. [5].

Consistent motion of protein filaments in the forward direction can be helpful in keeping errors low. A large number of motors can work independently to ensure parallel computation. A large number of motor-driven protein filaments can be employed in a highly efficient manner, which also helps avoid the problems of power consumption and heating associated with electronic computers. In the following sections, we will discuss how biomolecular motor systems can be utilized to overcome the problems of conventional computing. We will also introduce how molecular robots, prepared from biomolecular motors, have moved the current state of unconventional computing one step forward. Finally, we shed light on the challenges that must be overcome in order to utilize biomolecular motor-based swarm robots for solving mathematical problems of practical importance in a sustainable manner.

15.2. Applications of Biomolecular Motors in Computing

15.2.1. Parallel computing using biomolecular motors in nanofabricated networks

One of the prominent examples of unconventional computing using biomolecular motors was demonstrated in Ref. [7]. A combinatorial problem was encoded into the geometry of a physical network of lithographically defined channels in a planar device. A large number of actin filaments and microtubules, driven by myosin and kinesin motors, respectively, were used as agents to explore the network and solve a classical non-deterministic-polynomial-time (NP) complete problem (Figure 15.2). The unidirectional motion of the filaments was equivalent to elementary operations of addition, whereas their positions were equivalent to "running sums". Computation was initiated by applying the actin or microtubules to the loading zone of a lithographically fabricated, motor-functionalized network in the presence of fuel (adenosine 5′-triphosphate, ATP). The filaments, after traversing the network, appeared at exits corresponding to target sums. Both the actin–myosin and microtubule–kinesin systems were found effective in solving a combinatorial problem using parallel computation. Both types of filaments were able to find all correct results, and more filaments exited nodes corresponding to correct results than to incorrect results. This biomolecular motor-based computing device consumes much less energy than electronic and microfluidic computers. However, the following technical advancements are necessary for this device to challenge an electronic computer: scaling up of the physical network size, reduction of the agent feeding time, tracking of large numbers of filaments, programmability, and control of filament attachment to and detachment from the network. Moreover, increasing the speed of operation of the device relative to an electronic computer should also be a future target.13
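The essence of this network computation can be captured in a few lines. The toy model below is our abstraction, not the device of Ref. [7]: each filament makes one binary choice per split junction (add the next set element or pass by), so filaments accumulate only at exits whose positions are subset sums. The set {2, 5, 9} is used purely for illustration.

```python
# Toy abstraction of filaments exploring a subset-sum network.
import random
from collections import Counter

def explore_network(values, n_filaments=10_000, seed=0):
    rng = random.Random(seed)
    exits = Counter()
    for _ in range(n_filaments):
        # One independent path choice per split junction.
        total = sum(v for v in values if rng.random() < 0.5)
        exits[total] += 1
    return exits

# Exits populated for {2, 5, 9}: the subset sums 0, 2, 5, 7, 9, 11, 14, 16.
print(sorted(explore_network([2, 5, 9])))
```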


Figure 15.2. (a) Solving a subset sum problem by myosin-driven actin filaments (left) and kinesin-driven microtubule filaments (right). (b) Experimental results obtained from actin (left) and microtubule filaments (right). (c) Monte Carlo simulation results for actin filaments (left) and microtubules (right). In (a)–(c), green numbers and bars represent correct results, and magenta numbers and bars represent incorrect results; reproduced with permission from Ref. [7].

15.2.2. Computing using swarm robots prepared from biomolecular motors

Compared to the computation performed using biomolecular motor-driven single protein filaments, as discussed above, computation based on swarms of biomolecular motor-driven protein filaments has attracted much attention in recent years.

The application of swarms in computing stems from the fascinating coordinated behavior of living organisms, such as birds, fish, cells, and bacteria, observed in nature.14, 15 The swarms of living organisms display an outstanding ability to process information received through interactions with nearby individuals and to cooperatively change their organization, for example, the position, shape or size of the groups, in the absence of any leader. Swarming provides living organisms with several emergent functions, such as parallelism, robustness, and flexibility, which are unachievable by a single entity.16 "Parallelism" enables the sharing of tasks by creating "groups", "robustness" helps to ensure that tasks are executed properly, and "flexibility" permits the swarms to respond to changes in their environments. Such attractive features were the motivation behind constructing swarm robots from self-propelled biomolecular motors17 and employing the swarms in unconventional computing. Owing to the sensing, actuating, and information-processing abilities of the swarms, swarm robots have opened a new dimension in unconventional computing.

15.2.2.1. Design and construction of molecular robots

Recent progress in molecular robotics has been enabled by the fusion of biotechnology, chemical engineering, supramolecular chemistry, nanotechnology, etc., which have also greatly benefited the creation of swarm robots using biomolecular motors.17 A molecular robot is an integrated system formed through the combination of different molecular devices that can work as processors, sensors, and actuators. As molecular processors, various DNA/RNA-based nanostructures have been developed that can sense a variety of signals and process the information into output signals for other parts of the robots.18–20 Photo-responsive molecules have been ideal candidates for providing sensing ability to the robots.21 Biomolecular motor proteins have been considered the best candidates as actuators for molecular robots. By integrating these molecular elements in a bottom-up manner, a molecular robot was recently developed.6

Figure 15.3. Preparation of molecular robots from microtubules and swarming of the molecular robots. (a) Microtubules are conjugated with single-strand DNA with complementary sequences. Azobenzene, a photo-responsive molecule, is inserted in the DNA to facilitate photo-regulation of the hybridization of the complementary sequences. (b) Microtubules, gliding on a kinesin-coated substrate, form swarms through self-assembly due to hybridization of complementary DNA sequences. (c) The pattern of a swarm can be tuned by changing the mechanical properties of the microtubules. (d) Photo-regulated hybridization of DNA strands, which consequently permits photo-regulation of the swarming of microtubules. Scale bar: 20 µm; reproduced with permission from Refs. [6, 17].

The biomolecular motor system microtubule–kinesin was employed for fabricating the molecular robot (Figure 15.3). DNA has been used as the information processor, and photo-sensitive molecules such as azobenzene played the role of sensors for the molecular robots. The basic unit of the molecular robot was prepared by conjugating single-strand DNA to microtubules (Figure 15.3). Furthermore, photo-responsiveness was achieved by introducing azobenzene into the DNA. The activity of the prepared molecular robots was evaluated from their dynamics on a kinesin-coated substrate. In the presence of chemical energy obtained from ATP, the molecular robots exhibited gliding motion with an average velocity of ∼600 nm/s, which is very close to the velocity of microtubules without DNA conjugation. Almost 85% of the kinetic characteristics of the microtubules were retained despite the conjugation of DNA.

15.2.2.2. Swarming of molecular robots

Local interactions among individuals play an important role in the swarming of self-propelled objects. Swarming of molecular robots was realized by utilizing the molecular recognition ability of DNA to control the local interactions between motile molecular robots (Figure 15.3). Using an association DNA as an input signal, a large number of molecular robots gliding on a kinesin-coated substrate were allowed to form swarms. The input DNA strand was designed such that it mediated attractive interactions among the motile molecular robots. In a swarm of molecular robots, all individuals moved in the same direction, which is determined by the polarity of the microtubules. DNA-based computation was used not only for swarm formation but also for the dissociation of the swarms into solitary molecular robots.6 The input signal of a dissociation DNA prompted the groups of molecular robots to separate into single robots through a strand displacement reaction of the complementary DNA. Thus, the controlled swarming of molecular robots prepared from biomolecular motors, using DNA-based molecular recognition, underscores the potential of biomolecular motors and their swarms in unconventional computation.

15.2.2.3. Controlling the shape morphology of swarms of molecular robots

The morphology of the swarms of molecular robots was regulated by tuning the rigidity of the microtubules that were the basic units of the molecular robots (Figure 15.3). For instance, molecular robots synthesized from microtubules having a relatively high bending stiffness of 62 × 10⁻²⁴ N·m² formed linear bundle-shaped swarms and exhibited translational motion. On the other hand, when the stiffness of the microtubules was reduced by changing their polymerization conditions, circular swarms were formed that exhibited rotational motion (Figure 15.3). The rigidity of microtubules can be controlled not only by changing their polymerization conditions, but also by using stabilizers such as taxol or microtubule-associated proteins.22 Changes in the morphology of the swarms of molecular robots were also found to affect the path persistence length of the robots.6

15.2.2.4. Remote control of the swarming of molecular robots

By incorporating a photo-sensitive molecule (azobenzene) into the DNA strands, swarm formation by the molecular robots and dissociation of the swarms into individual robots were controlled simply using light (Figure 15.3). Under ultraviolet light irradiation (λ = 365 nm), the azobenzene was transformed into the cis-state, and the DNA computing element was switched to the "OFF" state, resulting in no swarm formation by the molecular robots. When visible light (λ = 480 nm) was applied, the azobenzene was transformed into the trans-state, and the DNA computing element was turned "ON", allowing the formation of swarms of robots. By adjusting the physical properties of the molecular robots, it was also possible to control the behavior of the swarm groups by light, such as translation and rotation, simultaneously with the formation and dissociation of the swarms.6 These results imply that biomolecular motor-based computation can be remote-controlled simply using light and photo-responsive DNA.

15.2.2.5. Logical operation of molecular robots

The ability of DNA to store incredible amounts of data allows parallel computations to be performed. Utilizing DNA as a logic operator in molecular computing, different logic operations, such as YES, AND, and OR gates, were demonstrated by the molecular robots.6 In those logic operations the output was confirmed by the swarming of molecular robots, which was regulated by suitable input DNA signals (Figure 15.4). The YES logic gate was realized by using an input DNA signal which facilitated swarming of the molecular robots already equipped with DNA signals complementary to the input DNA signal. The AND logic gate was demonstrated by designing two different input DNA signals, which were partially complementary to the two DNA signals carried by two groups of molecular robots. Swarming of the molecular robots was observed as the output only when both input DNA signals were present. The OR logic gate was operated by the simultaneous operation of two swarming groups. In each group, two types of molecular robots were equipped with two different DNA signals.

Figure 15.4. Design of logic gates constructed using molecular robots prepared from microtubules. For the YES gate, a suitable DNA signal (DNA-1) was input into the system and swarming was obtained as the output signal (1 to 1). For the AND gate, both DNA-2 and DNA-3 needed to be present to obtain swarming. For the OR gate, the presence of either DNA-1 or DNA-4 was enough to obtain swarming; reproduced with permission from Refs. [6, 17].

The two types of robots swarmed independently when another input DNA signal, partially complementary to the DNA carried by the robots, was available. Both swarm groups were operated in a concerted fashion when both input DNA signals were available. In all these logical operations, the output can be confirmed visually from a change in the morphology of the swarm of molecular robots, or from a change in the color of the molecular robots. Association ratios were 85–100% for all the systems corresponding to the output as swarming, significantly higher than those for the outputs in which swarming was not realized through the logical operations.

For the sake of completeness, we carried out the same study for the parallel crossing configuration of Figure 16.6(a) by again considering a TEM square pulse excited from port 1 with a duration of 0.5 ns. In this case, the positive/zero voltage at port 1 is applied to the top/bottom metals of the parallel plate waveguides (see Figure 16.6(a)).

Figure 16.6. Parallel crossing with single-port excitation. (a) Schematic representation. (b) Numerical simulation results of the in-plane magnetic field (Hx,z) at different times: (left) incident pulse (Hx) before it has reached the crossing point, (center) Hx component of the H-field after the square pulse has passed the crossing point, and (right) same as (center) but showing the Hz component of the H-field. Adapted and reproduced in part from Ref. [35] with permission of Wiley.

With this configuration, the numerical results of the in-plane H-field distribution at the same times as in Figure 16.5 (before and after the incident TEM square pulse passes the crossing point) for the parallel junction are shown in Figure 16.6(b). As can be seen, the incident pulse (from port 1) travels towards the crossing point along the z-axis at t = 1.8 ns (left panel of Figure 16.6(b)). At t = 2.4 ns (center and right panels of Figure 16.6) the TEM square pulse has passed the crossing point, producing four square pulses which, as in the series crossing, travel towards ports 1–4. Interestingly, note how the amplitude of the Hx-field for the pulse travelling towards port 1 has the same sign (negative) as the incident pulse; that is, the out-of-plane Ey-field is now directed in the opposite direction to that of the incident signal (meaning that the voltage at port 1 will now be zero/positive at the top/bottom metals). This is an expected result, as the reflection coefficient for the parallel crossing is ρ < 0, as described before.

16.3.3. Scattering matrix approach

Let us now mathematically define the general scenario where several pulses arrive at the crossing point from different sections at the same time. In this context, we can simply apply the superposition of TEM square pulses, similarly to the well-known Huygens principle and the transmission-line matrix (TLM) method.39–41 As we are dealing with N ports in the system, an efficient way to define such a superposition of pulses is to describe the whole system by a scattering matrix, y = Ax^T, with y and x the output and input vectors, respectively, representing the N ports (see Section 16.3.1 and Ref. [35]), A = ±(I − γJ) for the series and parallel crossings, respectively, and I and J the N × N identity and all-ones matrices, respectively. If we map the input/output signals in A as columns/rows, respectively, we can define A for the series and parallel crossing cases (shown here for N = 4; upper signs for series, lower signs for parallel) as

\[
A = \begin{bmatrix}
\pm\frac{N-2}{N} & \mp\frac{2}{N} & \mp\frac{2}{N} & \mp\frac{2}{N}\\[2pt]
\mp\frac{2}{N} & \pm\frac{N-2}{N} & \mp\frac{2}{N} & \mp\frac{2}{N}\\[2pt]
\mp\frac{2}{N} & \mp\frac{2}{N} & \pm\frac{N-2}{N} & \mp\frac{2}{N}\\[2pt]
\mp\frac{2}{N} & \mp\frac{2}{N} & \mp\frac{2}{N} & \pm\frac{N-2}{N}
\end{bmatrix}. \tag{16.2}
\]

Note that the diagonal and non-diagonal terms in Equation (16.2) are the reflection and transmission coefficients from Equation (16.1).
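Numerically, Equation (16.2) is a one-liner. The sketch below is our illustrative code, with γ = 2/N being the value consistent with the entries of Equation (16.2); it builds A for either crossing and applies it to an input vector of pulse polarities.

```python
# Scattering matrix A = ±(I − γJ) with γ = 2/N; upper sign: series crossing.
import numpy as np

def scattering_matrix(n_ports, crossing="series"):
    A = np.eye(n_ports) - (2.0 / n_ports) * np.ones((n_ports, n_ports))
    return A if crossing == "series" else -A

# Single-port excitation of the 4-port series junction (Catt's junction):
print(scattering_matrix(4) @ [1, 0, 0, 0])   # [ 0.5 -0.5 -0.5 -0.5]
# Three TLs connected in series, cf. Equation (16.5) in Section 16.3.6:
print(scattering_matrix(3) @ [1, 0, 0])      # [ 1/3 -2/3 -2/3]
```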

16.3.4. Multi-way junctions: Series and parallel

Is there another way, in addition to the scattering matrix approach, to represent the performance of our EM-wave-based switching mechanism? To answer this question, let us again focus on the 4-port configuration (Catt's junction) as a representative case (note that our approach can be extended to N interconnected waveguides, as we have detailed in previous sections of this chapter and in our recent work in Ref. [35]). In this context, we can transfer and re-direct data, in line with Refs. [11, 38, 42, 43]. Consider the cases shown in Figures 16.7 and 16.8, where the series (panels a) and parallel (panels b) crossings are schematically shown. As shown, the TEM square pulses are applied from port 1 (left port) and port 3 (right port). Let us first study the case when the pulses have the same positive polarity (i.e., (+, +)). This case is represented in Figures 16.7(a) and 16.7(b) for the series and parallel configurations, respectively. Note that the polarities have been mapped using the approach described in Figure 16.4.

Figure 16.7. (a) Series and (b) parallel crossings with two-port excitation coming from ports 1 and 3 using equal polarities of the pulses.

Figure 16.8. (a) Series and (b) parallel crossings with two-port excitation coming from ports 1 and 3 using pulses with opposite polarities. Adapted and reproduced in part from Ref. [35] with permission of Wiley.

As observed, the incident pulses coming from ports 1 and 3 are each equally divided into four pulses after passing the crossing point (as explained in Section 16.3.1). Hence, a total of eight pulses are excited in the whole structure, as expected. From this, we can represent the propagation of the final pulses in each TL as the superposition of two TEM square pulses per TL. The resulting diagram of such performance is represented in Figure 16.7 for the series and parallel crossings. Note that in both scenarios the TEM square pulses propagate towards ports 2 and 4 (top and bottom TLs), while no signal travels towards ports 1 and 3, as the pulses excited in these two TLs interfere destructively. Interestingly, it can be seen how the polarity of the pulses differs between the series and parallel crossings, with (−, −) and (+, +), respectively, for the pulses traveling towards ports 2 and 4. The performance described in Figure 16.7 can also be demonstrated mathematically by simply applying the scattering matrix approach described in Section 16.3.3 with x = [1, 0, 1, 0]. Applying y = Ax^T for the series and parallel crossings from Figure 16.7, the output vectors of TEM square pulses are

\[
y = \begin{bmatrix}
\pm 0.5 & \mp 0.5 & \mp 0.5 & \mp 0.5\\
\mp 0.5 & \pm 0.5 & \mp 0.5 & \mp 0.5\\
\mp 0.5 & \mp 0.5 & \pm 0.5 & \mp 0.5\\
\mp 0.5 & \mp 0.5 & \mp 0.5 & \pm 0.5
\end{bmatrix}
\begin{bmatrix} 1\\ 0\\ 1\\ 0 \end{bmatrix}
=
\begin{bmatrix} 0\\ \mp 1\\ 0\\ \mp 1 \end{bmatrix}. \tag{16.3}
\]

This demonstrates that y = [0, −1, 0, −1] and y = [0, 1, 0, 1], in agreement with Figures 16.7(a) and 16.7(b), respectively. What will happen if the TEM square pulses have different polarities? This scenario is represented in Figures 16.8(a) and 16.8(b) for the series and parallel crossings, respectively. As observed, in these scenarios the pulses traveling towards ports 2 and 4 are cancelled out and propagation is only permitted towards ports 1 and 3. In these cases, the pulses at the latter ports have the polarities (+, −) and (−, +) for the series and parallel configurations, respectively. Following the same process as before, these cases can be calculated mathematically using the scattering matrix approach as

\[
y = \begin{bmatrix}
\pm 0.5 & \mp 0.5 & \mp 0.5 & \mp 0.5\\
\mp 0.5 & \pm 0.5 & \mp 0.5 & \mp 0.5\\
\mp 0.5 & \mp 0.5 & \pm 0.5 & \mp 0.5\\
\mp 0.5 & \mp 0.5 & \mp 0.5 & \pm 0.5
\end{bmatrix}
\begin{bmatrix} 1\\ 0\\ -1\\ 0 \end{bmatrix}
=
\begin{bmatrix} \pm 1\\ 0\\ \mp 1\\ 0 \end{bmatrix}, \tag{16.4}
\]

in agreement with the polarities represented in Figure 16.8. These results demonstrate how the series and parallel models of N connected transmission lines (parallel plate waveguides in our work) can act as logic switching devices. Note that this approach can be applied to any number of ports N and any polarities, and also to the excitation of TEM pulses from orthogonal ports. We refer the readers to Ref. [35], where we provide a full description of such cases. It is interesting to remark that, as shown in this section, all the cases perform a decision-making process, as represented in Figure 16.3, where, for instance, a pulse excited from port 1 can act as a data-sampling token enabling the operation If . . . Then . . . Else as a fundamental computing process for future computing applications using EM waves.
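For completeness, the two-port cases of Equations (16.3) and (16.4) can be checked numerically. The snippet below is our illustration; the helper is repeated from the sketch in Section 16.3.3 so the example is self-contained.

```python
# Verifying Equations (16.3) and (16.4); illustrative code.
import numpy as np

def scattering_matrix(n, crossing="series"):
    A = np.eye(n) - (2.0 / n) * np.ones((n, n))
    return A if crossing == "series" else -A

x_same = [1, 0, 1, 0]    # equal polarities injected at ports 1 and 3
x_opp  = [1, 0, -1, 0]   # opposite polarities at ports 1 and 3

print(scattering_matrix(4, "series") @ x_same)    # [ 0. -1.  0. -1.]
print(scattering_matrix(4, "parallel") @ x_same)  # [ 0.  1.  0.  1.]
print(scattering_matrix(4, "series") @ x_opp)     # [ 1.  0. -1.  0.]
print(scattering_matrix(4, "parallel") @ x_opp)   # [-1.  0.  1.  0.]
```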

16.3.5. Simulation results

In the previous sections, we have shown how our proposed technique for transferring and switching information using square TEM pulses in TLs can be analyzed using the scattering matrix technique and the schematic representations discussed in Section 16.3.4. The current section is devoted to demonstrating our approach via numerical simulations using the transient solver of the commercial software CST Studio Suite (see details of the setup in Ref. [35]). Here, we will focus our attention on the scenario where four TLs are connected in series and the incident TEM square pulses are applied from port 1 and port 3. Moreover, we will consider pulses of different polarities, (+, −) at the two ports, respectively, in order to evaluate the case described and studied in Figure 16.8(a). For completeness, a sketch of the configuration under study is shown in Figure 16.9(a), considering parallel plate waveguides as our TLs. As shown in this figure, 0 and 1 volts are applied to the bottom and top metallic plates; as a result, the incident pulses will have an Ex-polarized E-field and will travel towards the crossing point along the z-axis. With this setup, the numerical results of the in-plane Ex-field distribution at a time before the pulses have reached the junction between the waveguides (t = 1.8 ns) are shown in the left panel of Figure 16.9(b), demonstrating how the pulses have a (+, −) polarity (again with polarities defined via the mapping described in Figure 16.4). At t = 2.4 ns, the pulses have passed the crossing point, and the numerical results of the in-plane Ex- and Ez-field distributions on the xz plane at this time are shown in the central and right panels of Figure 16.9(b). As observed from the Ex-field, two square pulses appear in the system (one traveling towards port 1 and one traveling towards port 3), while there are no pulses traveling towards ports 2 and 4. This performance can be corroborated by looking at the Ez-field distribution (right panel in Figure 16.9(b)), where no clear pulses are present in the system. For the sake of completeness, we have also extracted the voltage at each port as a function of time, and the results are plotted in Figure 16.9(c), again demonstrating how the TEM square pulses propagate only towards ports 1 and 3 while none travel towards ports 2 and 4, in good agreement with the calculations presented in Equation (16.4) from Section 16.3.4.


Figure 16.9. Logic switching, numerical results. (a) Schematic representation of a series crossing using parallel plate waveguides when incident pulses of polarities "+" and "−" are inserted from ports 1 and 3, respectively. (b) Snapshots of the electric field distributions at different times, showing the incident pulses at t = 1.8 ns (left), and the Ex (center) and Ez (right) components of the electric field at t = 2.4 ns (i.e., at a time after the pulses have passed the crossing point). (c) The voltage as a function of time calculated at each port. Adapted and reproduced in part from Ref. [35] with permission of Wiley.

As our approach is scalable, the dimensions of the waveguides in all the proposed configurations can be reduced to deal with shorter square pulses; Section 16.3.6 demonstrates a ×0.01 downscaled example using a dispersive model for the metallic plates.44–46 Finally, as with all technologies, there are some challenges that our proposed technique may face. For instance, our computing approach relies on the control of the phase of the input TEM square pulses excited from multiple ports. However, this can be addressed by current technology, where voltage can be accurately controlled.

Moreover, in this chapter we have provided the fundamental theory of TEM square-pulse-based computing, which can be further exploited if manipulation of the phase of the source is not possible. As is known, the required phase of the pulses can, for instance, be manipulated at will by changing the length of the transmission lines and/or by using different materials filling the transmission lines.28, 30

16.3.6. Scaling down geometries

Could we scale down the dimensions of the structures studied in this chapter for compact applications? This section is dedicated to this aspect and demonstrates how our approach is scalable. Let us study a scenario where three waveguides are connected in series with the same dimensions as in Figure 16.9(a), but scaled down by a factor of ×0.01. For the numerical analysis below, we also scale down the temporal duration of the incident TEM square pulses (pulses with a duration of 0.005 ns). However, as is known, shorter pulses and complex signals (such as pulse trains) contain further higher-order harmonics in the frequency domain. Hence, it is important to consider the effect of dispersion on the pulses in the design. As in the previous sections, we consider parallel plate waveguides filled with air. The main difference now is that, as we have scaled down the geometries, a dispersive model of metal is used for the metallic walls (rather than PEC, as in the previous sections). We use silver with a relative permittivity following a Drude–Lorentz model.44, 46, 47 Let us consider the case when only one port is used to excite the structure (port 1 in this case). With this configuration, the numerical results of the in-plane E-field distribution before and after the incident pulse has crossed the central junction between the waveguides are shown in Figure 16.10. Note that, similarly to the results discussed in previous sections, we have represented the Ex-field distribution for t = 0.018 ns and both the Ex- and Ez-field distributions for t = 0.024 ns to show the temporal response of the system. As shown, at t = 0.024 ns three pulses are produced, each traveling towards one of ports 1–3.

Figure 16.10. Scaling down geometries, example. (a) Schematic representation of the TL model using three TLs connected in series. The incident pulse is applied from port 1. (b) Snapshots of the electric field distributions at different times, showing the incident pulse at t = 0.018 ns (left), and the Ex (center) and Ez (right) components of the electric field at t = 0.024 ns (i.e., at a time after the pulses have passed the crossing point).

Moreover, note that the amplitude of the pulse at port 1 is smaller than that of the square pulses traveling towards ports 2 and 3, in agreement with the values predicted by the scattering matrix from Section 16.3. For completeness, the scattering matrix representation of this system is as follows:

\[
y = \begin{bmatrix}
1/3 & -2/3 & -2/3\\
-2/3 & 1/3 & -2/3\\
-2/3 & -2/3 & 1/3
\end{bmatrix}
\begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix}
=
\begin{bmatrix} 1/3\\ -2/3\\ -2/3 \end{bmatrix}. \tag{16.5}
\]

16.3.7. Pulse generation and control requirements

In order to control pulse generation and absorption, we need to build a circuit based on switches. An example of such a circuit is shown in Figure 16.11.

Figure 16.11. Configuration of elements for pulse generation.

It contains a DC voltage source whose value determines the amplitude of the pulse, a terminating resistor R whose value should equal the characteristic impedance Z0 of the TL, and three switches, which can be implemented using high-quality, low-resistance MOSFETs. The switches can be controlled by the external binary control signals up, dn, and term. It is important that the pulse shape be kept as close to square as possible, for which the up and dn switches need high gain and low resistance. We do not present here the implementation of the actual control logic for the switches, but it can be done using, for example, techniques based on logic synthesis of self-timed circuits from Signal Transition Graphs, described in Ref. [48]. Such a controller will require high-precision delay elements to produce the required pulse width. Ultimately, the bandwidth of the whole system will be determined by the bandwidth of the pulse control, which with today's CMOS technology could be around 5–10 GHz. With high-speed technologies such as GaAs, the speed can go up to 50–100 GHz. Some information about the state of the art in high-speed switches can be found in Ref. [49]. Also, to obtain sharp-slope pulses, one can use drivers with bootstrapping.49, 50
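As a rough behavioural sketch (ours, with illustrative timing; not the authors' controller), the three switches of Figure 16.11 can be modelled as a time-dependent state that drives the line for one pulse width and then terminates it:

```python
# Behavioural model of the pulse-generation circuit of Figure 16.11.
# The pulse width and time step are illustrative; 'up' drives the line to
# +Vdc, 'dn' (assumed symmetric) would drive it to -Vdc, and 'term'
# connects the terminating resistor R = Z0 after the pulse.
def switch_states(t, t0=0.0, width=0.1e-9):
    """Return the (up, dn, term) switch states at time t (in seconds)."""
    if t0 <= t < t0 + width:
        return (1, 0, 0)   # generate the positive pulse
    return (0, 0, 1)       # terminate the line through R

Vdc = 1.0                  # pulse amplitude set by the DC source (arbitrary)
for step in range(8):
    t = step * 0.025e-9
    up, dn, term = switch_states(t)
    v = Vdc * (up - dn)
    print(f"t = {t * 1e9:.3f} ns: up={up} dn={dn} term={term} V={v:+.1f}")
```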

16.4. Future Directions

In this section, we provide our view of how our technique for TEM square pulse-based computing can be used in further scenarios, currently under development.

16.4.1. 3D-based structures

All these primitives can lay the foundation for if...then...else decision points. Such decision points can form the basis either for conventional logic-gate-like computing or for routing functions that switch data streams. In the latter case, for example, we can envisage data packet routing in a mesh network consisting of cross-points in 2D or 3D structures. We can also consider combinations of series and parallel crossings forming interesting 3D structures. For instance, we could exploit the fact that opposite-polarity pulses on adjacent (series-connected) sections can form stable attractors and rotate around their parallel crossings (because for parallel crossings the polarity is the same). In a 3D crystal-like structure, where the adjacent, cube-like nodes have either "+" or "−" polarity, we can organize oscillations by having pulses simultaneously rotating around these cubes, as shown in Figure 16.12. The figure depicts a cube that is driven to (+) polarity; all six of its sides are involved in series crossings with adjacent similar cubes driven to (−) polarity, and the pulses rotate around these three orbits. Such systems can, for example, provide stable clocking in the overall computational fabric.

16.4.2. Graph-based computing

The above primitives and their networks can be used to develop a new type of computational structure based on graphs. For example, for a series crossing with 2-valued pulses, the process of computing can be described as consisting of elementary steps of re-writing the graphs, as shown in Figure 16.13.

Figure 16.12. 3D grid structure: one cube.

Figure 16.13. Graph-based computing: example of a series crossing.

Here, α is a Boolean value, while ᾱ stands for the opposite polarity, and arrows “>” and “

15 V) gave rise to H2 gas bubbling at the dominant cathodic electrode because of a small nonzero Voffset, which resulted

Figure 20.3. Time series of wire growth under an applied AC voltage of 20 kHz: (a)–(c) at 15, 20, and 30 Vpp, respectively. Images were taken 5 min after the voltage application. (Scale bars: 100 μm.)

from electrolysis of the water solvent. A further increase in Va caused bubbling at both electrodes, because the magnitude of Voffset became relatively low compared with the increased Va. Abrupt gas generation was observed when the number of wires increased.

20.3.3. Conductance increase of wires

The conductance between the electrodes increases with the wire growth. The growth can be stopped at any given time, and the grown wire maintains its shape and conductance; therefore, it can be used as a resistive memory. The conductance of the wire was measured by applying a low DC voltage (0.1 V) to avoid polymerisation. A test voltage of >0.4 V yielded thicker grown wires, although

Figure 20.4. Conductance change between electrodes with polymer growth under an applied AC voltage at (a) 10 kHz with 40 Vpp for electrodes with a 400-µm gap and (b) 100 kHz with 25 Vpp for round electrodes with a 50-µm gap.

the oxidation potential of the monomer EDOT was approximately −0.8 V. Figure 20.4(a) shows the conductance increase between electrodes with a triangular shape. An abrupt increase in the conductance occurred after 60 cycles, when the first wire made contact. The conductivity of the PEDOT:PSS wires grown in this manner was 0.01–1 S/cm, without wire-diameter dependence. The first and second wire connections exhibited stepwise increases, and the subsequent increase was continuous because the multiple wires formed multiple connections with each other. The maximum conductance was determined by the saturation of the number of wires or by the occurrence of H2 gas bubbling, which physically disrupted the wiring. The onset condition of the H2 gas bubbling depended in a complex way on Vpp, the frequency of the applied AC voltage, and the shape of the electrodes. The optimal condition for the machine-learning process was a continuous, linear conductance increase over a wide range.

Figure 20.4(b) shows the wire growth between electrodes with round-shaped apexes at fs = 100 kHz. The conductance increased continuously and reached a saturated value of 17 μS. The inset shows a large number of wires at high density; the individual wires are difficult to see because their diameters were approximately 0.2 μm. No wires grew between wider electrodes with a larger curvature radius of 100 μm and a gap length of 50 μm until Vpp exceeded 40 V, which caused H2 gas bubbling.

20.3.4. Directional growth of wire and 3D growth

The wires grew along the electrical potential gradient in the solution. The primary wire generally grew close to the shortest path between the apexes of the electrodes. The subsequent wire growth frequently exhibited elongation along the outer lines of the electrical force, as shown in Figures 20.3(a) and 20.4(a). These wires eventually moved and fell into the central wire fascicle, as shown in Figures 20.3(b) and 20.3(c). This drastic movement of the wires is attributed to heat convection of the solution originating from the electrical current flow. The high directivity of the wire growth can be used to realize multiple wiring to multiple electrodes. Figure 20.5(a) shows wires sorted from one electrode to three electrodes while controlling the respective conductances; a sketch of such a control loop is given below. The growth voltage applied to the three plural electrodes was 20 kHz and 25 Vpp, with a 0.1-V offset, which advanced wire growth from the single counter electrode. A programmed Arduino controlled the mechanical relays of the plural electrodes to attain the target conductance ratio, which was set to 2:1:5 (top:middle:bottom). After a wire connected to the middle electrode, growth to the bottom and top electrodes was conducted, eventually attaining the ordered ratio. Such multiple wiring is a peculiar competence of polymer wire growth. It is possible for the polymer wires to crosslink and grow between designated electrodes. Because a wire grows directionally along the electrical potential gradient, it can, in principle, connect designated terminals in 3D free space.
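The following is a highly simplified sketch of such a ratio-controlled growth loop. The functions read_conductance and grow are hypothetical stand-ins for the Arduino-controlled relays and readout, and the selection rule (grow whichever wire is furthest below its target share) is our plausible guess, not the authors' exact program.

```python
import random

TARGET_RATIO = {"top": 2, "middle": 1, "bottom": 5}  # 2:1:5 as in Fig. 20.5(a)

def read_conductance(electrode):
    # Hypothetical readout: apply a 0.1 V DC test voltage and return the
    # measured conductance in uS (randomized here so the sketch runs).
    return random.uniform(0.0, 5.0)

def grow(electrode, seconds=1.0):
    # Hypothetical relay action: connect this electrode to the AC growth
    # supply (20 kHz, 25 Vpp, 0.1 V offset) for `seconds`, then disconnect.
    pass

def growth_cycle():
    g = {e: read_conductance(e) for e in TARGET_RATIO}
    total = sum(g.values()) or 1.0
    share = sum(TARGET_RATIO.values())
    # Grow the wire whose conductance is furthest below its target share.
    deficit = {e: TARGET_RATIO[e] / share - g[e] / total for e in g}
    grow(max(deficit, key=deficit.get))

for _ in range(50):
    growth_cycle()
```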

Figure 20.5. (a) Plural polymer wiring control branching to three different electrodes from a single one under an applied AC voltage of 20 kHz and 25 Vpp with a 0.1-V offset. The conductance ratio was set as 2:1:5 (top:middle:bottom electrodes). Optical microscope images of a PEDOT:PSS wire grown between (b) Cu wire electrodes with a diameter of 50 µm buried in UV resin and (c) a single electrode and three opposite electrodes, where the electrodes are Au wires with diameters of 100 µm.

Polymer wiring in 3D space using sharpened metal tips and buried metal terminals was also attempted. Wiring was easily attained, as shown in Figures 20.5(b) and 20.5(c); however, conductance control of the 3D multi-branching was difficult, presumably because the multi-terminals used here were too large. According to the conditions investigated, the challenge to overcome is fabricating multiple electrodes with a small contact area with the solution, for example, 10²–100² μm², with a fine pitch of 50–400 μm, facing each other at a distance of 50–400 μm. A small metal-exposed area is effective for creating a sharp pathway along the electrical potential gradient, and a proper ratio of the pitch to the gap length is needed to branch the wires connecting a number of vicinal counter electrodes.

It is considered that finely fabricated multi-terminals would make such 3D polymer wiring possible.

20.4. Machine Learning for Polymer Wire Growth

An elemental function in learning is synaptic plasticity; accordingly, the synapse component in a physical ANN structure must have a resistive change ability and a memory function. The synaptic weight in an ANN should increase and decrease in accordance with the orders from a learning program, whereas the conductance of a grown polymer wire can only increase. In a physical ANN configuration, each synaptic weight therefore consists of paired positive and negative elements, because negative weights play a critical role in the product-sum operation of the ANN. It is common practice to control the conductance of memristive devices only in the increasing direction through learning, using the decrease operation only when a saturated device must be reset. Therefore, learning should be completed before the wire growth saturates.

20.4.1. Supervised learning: Simple perceptron for simple logic gates

Simple logic AND and OR gates, each consisting of two input neurons and one output neuron connected through paired negative and positive synapses, as shown in Figure 20.6(a), were preliminarily demonstrated using the simple perceptron training algorithm. Four pairs of electrodes, functioning as synapses, were connected to four electrical circuits used to measure the conductance, as shown in Figure 20.6(b), where the circuits and relays are abbreviated. The test current flowing through each polymer wire was converted to a voltage and read by the Arduino as a synapse weight. The Arduino controlled the mechanical relays according to the installed simple perceptron program. When the program starts, the TEST protocol switches all the relays to the test circuit. When the test circuit is connected, a DC test voltage of 0.1 V is applied, and the Arduino reads serial output values for all the synapses — 1p, 1n, 2p, and 2n — memorizing them as the synapse weights w1p, w1n, w2p, and w2n, respectively.

Figure 20.6. (a) Two-input and one-output single ANN layer, where a synapse consists of two positive and negative divided synapses. (b) Machine-learning system for polymer wire growth, which consists of a substrate with four pairs of electrodes, four electrical circuits that can switch between the growth and test modes, an AC power supply, a DC power supply, and an Arduino. (c) Flowchart of the simple perceptron algorithm. (d) Conductance change during AND-gate learning. (e) Optical microscope image taken after AND gate learning for four pairs of electrodes and polymer wires between 1p and 2p electrode pairs.

Next, the inputs X1 and X2 are called randomly from the four combinations of 1 and −1. With the synapse weights and inputs, the perceptron's output Y is calculated as follows:

$$Y = \operatorname{sgn}\{(w_{1p} - w_{1n})X_1 + (w_{2p} - w_{2n})X_2 - \theta\}, \tag{20.2}$$

where sgn(x) represents the sign function splitting the product sum into 1 or −1, that is, sgn(x) = −1 if x < 0 and sgn(x) = 1 if x ≥ 0, and θ represents the decision threshold, which affects the learning rate because the initial weights of the physical synapses are not zero. A finite θ value was needed to prompt smooth progress of the learning at the earlier stages. In the next step, the decision is made: if the output Y is identical to the supervised signal T, the YES flow returns to a new epoch. If Y is not T, the NO flow moves to learning, where some synapses must change their weights to approach the correct answer. The training signal is generated using the following error function:

$$\Delta W_i = \varepsilon(T - Y)X_i, \tag{20.3}$$

where ε is a positive constant. For example, if the inputs X1 and X2 are 1 and −1, respectively, for the AND gate, T is −1 and Y is 1. When ε = 0.05, the training values ΔW1 and ΔW2 are −0.1 and +0.1, respectively, indicating that the weights of synapses 1 and 2 should decrease and increase by 10%, respectively.
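To make the loop concrete, here is a minimal software rendering of Eqs. (20.2) and (20.3) (our sketch: a plain weight dictionary replaces the relay-driven wire growth, θ = 0.1 is an assumed finite threshold, and each paired synapse may only grow, mimicking the grow-only polymer conductance):

```python
import random

def sgn(x):
    # Sign function of Eq. (20.2): 1 for x >= 0, -1 otherwise.
    return 1 if x >= 0 else -1

# Paired positive/negative synapse weights; each may only increase,
# as in the polymer hardware. theta and eps follow the text.
w = {"1p": 0.0, "1n": 0.0, "2p": 0.0, "2n": 0.0}
theta, eps = 0.1, 0.05
AND = {(1, 1): 1, (1, -1): -1, (-1, 1): -1, (-1, -1): -1}  # supervised T

for epoch in range(1000):
    x1, x2 = random.choice(list(AND))                      # random input call
    y = sgn((w["1p"] - w["1n"]) * x1 + (w["2p"] - w["2n"]) * x2 - theta)
    t = AND[(x1, x2)]
    if y != t:
        for name, xi in (("1", x1), ("2", x2)):
            dw = eps * (t - y) * xi                        # Eq. (20.3)
            # A positive dW grows the positive wire, a negative one grows
            # the negative wire; no conductance ever decreases.
            w[name + ("p" if dw > 0 else "n")] += abs(dw)

print(w)  # effective weights: w1 = w1p - w1n, w2 = w2p - w2n
```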

For the polymer case, each synapse weight update order is converted into commands that control the mechanical relays. Here, the Arduino program issues the order to connect only the 1n and 2p synapses to the AC growth power supply and GND. The relays of the other synapses — those without a growth order or with the right answer, that is, ΔW = 0 — maintain the neutral position. After the learning is complete, the decision Y = T always becomes YES; the test is therefore repeated, but no further growth order is commanded. Figure 20.6(d) shows typical conductance changes during learning of the AND gate. The conductance of w1p abruptly increased to 0.3 μS at the 130th epoch, but 2p exhibited no wire bridging for a considerable period. w2p then abruptly increased to 0.6 μS, and w1p subsequently increased to 0.7 μS at approximately the 290th epoch. After polymer bridging in electrode pairs 1p and 2p was attained, the output Y for every input combination yielded the correct answer against the supervised signal T of the AND gate. A pair of single wires, or a few polymer wires, attained the logic functions by maintaining a balance of the synapse weights. Learning of the OR, NAND, and NOR gates was performed as well.

20.4.2. Unsupervised learning: Autoencoder for feature extraction

An autoencoder is an unsupervised learning algorithm for an ANN that learns a representation of a set of data, typically reducing the data dimensionality. The encoded data carries abstract features extracted from the original high-dimensional data. Recently, powerful AI algorithms have employed autoencoders inside their deep neural networks; thus, the autoencoder concept has become widely used in learning. An autoencoder generally comprises a hidden layer between an encoder and a decoder, as shown in Figure 20.7(a).

Figure 20.7. Unsupervised autoencoder learning with 54 synapses. (a) Representation of an autoencoder consisting of input neuron Vi , hidden neuron hj , and output neuron Vdi forming two encode and decode ANN layers. Hidden neuron hj consists of positive and negative divided neurons. (b) Flowchart of the autoencoder algorithm.

The encoder compresses the input data Vi through a synapse network wij into a low-dimensional dataset hj. The decoder decompresses the dataset hj through the same synapse network wij into a new dataset Vdi. The objective of the autoencoder algorithm is the self-replication of Vdi from Vi through changes in the synapse weights wij. If the input Vi and output Vdi differ, only the erroneous wij values are updated; thus, no supervised data are needed, as shown in Figure 20.7(b). The learning is complete when all the Vdi values are identical to Vi for all the input data. A polymer ANN that realizes feature extraction of three 3 × 3 binary letters into three-pixel codes has been demonstrated via autoencoder machine learning. The ANN consists of nine input neurons and three hidden neurons, where each hidden neuron comprises positive and negative synapses; therefore, the ANN requires 54 synapses. The autoencoder employing this 9 × 6 polymer ANN used nine substrates with six pairs of electrodes, corresponding to a single input neuron (Vi) and six hidden neurons (hj±), as shown in Figure 20.8(a).

Figure 20.8. (a) Glass substrates with six pairs of Au electrodes, corresponding to an input neuron (Vi ) and six hidden neurons (hj ). The pairs of electrodes had a sharp apex with a gap of 250 µm. (b) Electric circuit board used for autoencoder learning.

The nine divided terminals of the respective hidden neurons hj± were short-circuited outside the substrates. Each of the nine Vi neurons and six hidden hj± neurons was connected to a respective electrical circuit board consisting of four relays, an op-amp-based I/V converter, and a single terminal to the electrode for the polymer, as shown in Figure 20.8(b). A total of 15 boards were thus connected, one to each Vi and hj± neuron. The autoencoder algorithm has two test phases: one is encoding, to determine the hidden neurons hj from the input Vi, and the other is decoding, to determine the output neurons Vdi from the input hj. In the first test phase, positive and negative test DC voltages, that is, +0.1 V and −0.1 V correlating with the binary data of a letter

(X, H, or T), were applied to Vi for 0.1 s, where 1 and −1 corresponded to black and white (as well as to positive and negative test voltages), respectively. The system then measured the current flows into hj+ and hj− at a given time, which correlated with the sums of products Σ9i(wij+Vi) and Σ9i(wij−Vi), respectively. The signals encoded from nine bits to three bits at the hidden neurons hj are calculated from the current values as follows:

$$h_j = \operatorname{sgn}\Big\{\sum_{i}^{9}(w_{ij+}V_i) - \sum_{i}^{9}(w_{ij-}V_i) - \theta\Big\}. \tag{20.4}$$

Then, the hidden neurons hj hold a dataset of three pixels, one of the eight types shown at the bottom of Figure 20.7(a). In the next test phase, a test DC voltage of +0.1 or −0.1 V correlating with hj was applied to hj±, and the current flow into neuron Vdi was measured, where the neurons Vi and Vdi correspond to the same single physical terminal Vi. Here, when hj was +1, voltages of +0.1 V and −0.1 V were simultaneously applied to hj+ and hj−, respectively; when hj was −1, voltages of the opposite polarity were applied. The measured current at the board-connected terminal of neuron Vi correlated with Σ3j(wij hj). The decoded Vdi was calculated as follows:

$$V_{di} = \operatorname{sgn}\Big\{\sum_{j}^{3}(w_{ij}h_j) - \theta\Big\}. \tag{20.5}$$

In the learning phase, when Vi and Vdi differ, the synapse weight should be increased or decreased in accordance with the following error function:

$$\Delta W_{ij} = (V_i - V_{di})h_j. \tag{20.6}$$

When ΔWij is nonzero, the terminals of the selected neurons i and j are connected to the AC growth voltage of 20 kHz and 40 Vpp and to GND for 1 s; thus, the polymer wire of wij grows. The polarity of ΔWij selects the learning terminal, hj+ or hj−. The remaining terminals are left floating. After the learning phase, the two test phases start again. When all the Vi − Vdi values remain zero over many repeated epochs, no weight update occurs, and the learning is complete.
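For comparison with the hardware loop, the following compact software analogue runs the encode/decode/update cycle of Eqs. (20.4)–(20.6) (our sketch: one signed weight per synapse stands in for the paired wires, θ = 0, the 3 × 3 letter bitmaps are our guesses, and, as with the fluctuating learning in Figure 20.9, convergence depends on the random initialization):

```python
import numpy as np

rng = np.random.default_rng(0)
letters = {  # 3x3 bitmaps flattened to 9 pixels: +1 = black, -1 = white
    "X": [1, -1, 1, -1, 1, -1, 1, -1, 1],
    "H": [1, -1, 1, 1, 1, 1, 1, -1, 1],
    "T": [1, 1, 1, -1, 1, -1, -1, 1, -1],
}
W = rng.normal(0.0, 0.1, size=(9, 3))  # wij: 9 inputs x 3 hidden neurons
theta = 0.0

def sgn(a):
    return np.where(a >= 0, 1, -1)

for epoch in range(600):
    errors = 0
    for v in letters.values():
        v = np.array(v)
        h = sgn(v @ W - theta)    # encode, Eq. (20.4)
        vd = sgn(W @ h - theta)   # decode, Eq. (20.5)
        # Update, Eq. (20.6): rows with Vi == Vdi contribute zero.
        W += np.outer(v - vd, h)
        errors += int(np.sum(v != vd))
    if errors == 0:
        print(f"converged after {epoch + 1} epochs")
        break
```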

Figure 20.9. Typical result of the autoencoder: error rate (upper) and conductance change (lower) of a successful autoencoder learning via polymer wire synapses, which encoded three nine-pixel characters (X, H, T) into respective three-pixel codes.

A typical result of the autoencoder machine learning using polymer wire growth for the three nine-pixel characters X, H, and T is shown in Figure 20.9. The upper panel shows the error rate, that is, the fraction of the nine pixels with Vi ≠ Vdi, and the lower panel shows the conductance change of all 54 synapses during the learning. One learning epoch corresponded to one cycle through the dataset; the three characters were input in series every three epochs. Clearly, the error rate decreased as the learning proceeded, eventually reaching zero. The conductance of the polymer increased while exhibiting small fluctuations; the increase started after 100 epochs. The network achieved encoding of the letters X, H, and T to (1, −1, −1), (−1, −1, −1), and (1, 1, 1), respectively, where these combinations were determined incidentally through the learning.

Figure 20.10. (a) Update order for wire growth (horizontal line) and conductance change of all 54 synapses (gray depth), with the learning flow running from top to bottom. (b) Software simulation results for the same update order, weight change, and autoencoder task. Histograms of the synapse weights that constitute the accomplished autoencoder: (c) the conductance of the polymer wires and (d) the numerical values in the software.

Figure 20.10(a) shows the update order for wire growth (indicated by a red line), as well as the resulting conductance change of all 54 synapses, with the learning epochs flowing from top to bottom. The lower block of Figure 20.10(b) shows the same update order and weight change for a synapse simulated in software, for comparison. While the polymer autoencoder converged after 500 learning epochs, the same autoencoder in software converged within approximately 30–60 learning epochs. This is because a single growth order for the polymer does not produce a monotonic conductance increase, owing to the noticeably nonlinear conductance increase

against growth order, as shown in Figure 20.4(a). Interestingly, the polymer ANN consisted of a small number of synapses with nonzero weight. As shown in Figures 20.10(c) and 20.10(d), while the polymer ANN had almost 30 synapses with zero weight, the software-simulated ANN always had