Neurocomputing

Volume 48, Issues 1–4, October 2002, Pages 477-488
N-bit parity neural networks: new solutions based on linear programming

https://doi.org/10.1016/S0925-2312(01)00612-9

Abstract

In this paper, the N-bit parity problem is solved with a neural network that allows direct connections between the input layer and the output layer. The activation function used in the hidden and output layer neurons is the threshold function. It is shown that this choice of activation function and network structure leads to several solutions for the 3-bit parity problem using linear programming. One of the solutions for the 3-bit parity problem is then generalized to obtain a solution for the N-bit parity problem using ⌊N/2⌋ hidden layer neurons. Compared to other existing solutions in the literature, the present solution is more systematic and simpler. Furthermore, the present solution can be simplified by using a single hidden layer neuron with a “staircase” type activation function instead of ⌊N/2⌋ hidden layer neurons. The present activation function is easier to implement in hardware than those in the literature for N-bit parity networks. We also review similarities and differences between the present results and those obtained in the literature.

Introduction

The XOR/parity problem has a long history in the study of neural networks. It is used in [6] as a basis for illustrating the limitations of the computational power of perceptrons. The parity mapping problem has since been recognized as one of the most popular benchmarks for evaluating neural network training algorithms [2]. The N-bit parity function is a mapping defined on the 2^N distinct binary vectors that indicates whether the sum of the N components of a binary vector is odd or even. When N = 2, the parity function is the exclusive-or (XOR) logic function. Let β = [β_1, β_2, …, β_N]^T be a vector in B^N, where B^N ≜ {β ∈ R^N : β_k = 0 or 1 for k = 1, 2, …, N}. The mapping ψ : B^N → B^1 is called the N-bit parity function if

ψ(β) = 1 if Σ_{k=1}^{N} β_k is odd, and ψ(β) = 0 if Σ_{k=1}^{N} β_k is even.

The parity mapping is considered difficult for neural network learning since a change in a single input bit changes the output.
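
Stated directly in code, the parity mapping reads as follows (a minimal sketch; the function and table names are ours):

```python
import itertools

def parity(beta):
    """N-bit parity: 1 if the number of 1-bits in beta is odd, else 0."""
    return sum(beta) % 2

# For N = 3, the function is defined on the 8 distinct binary vectors.
table = {beta: parity(beta) for beta in itertools.product((0, 1), repeat=3)}
```

Flipping any single bit of the input changes the output, which is why the mapping is a demanding benchmark.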

It was previously thought that a standard feedforward neural network model requires N hidden layer neurons to solve the N-bit parity problem [7]. The network in [7] uses the sigmoidal transfer function

f(u) = 1 / (1 + e^{−u}),

and is trained using the generalized delta rule. Although it is shown in [7] that the XOR function can be realized using a neural network with a single hidden layer neuron and direct connections between the input and output layers, no generalization utilizing such a structure is made for the parity function with N > 2.
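
For concreteness, the single-hidden-neuron XOR structure mentioned above can be sketched with threshold units in place of the sigmoid (the specific weights below are illustrative assumptions of ours, not the weights of [7]):

```python
def phi(u):
    # Threshold activation: 1 if u > 0, else 0.
    return 1 if u > 0 else 0

def xor_net(x1, x2):
    # One hidden threshold unit plus direct input-to-output connections.
    h = phi(x1 + x2 - 1.5)           # hidden unit fires only for (1, 1)
    return phi(x1 + x2 - 2*h - 0.5)  # direct connections carry x1 + x2
```

The hidden unit subtracts 2 exactly when both inputs are on, turning the input sum into its parity.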

In [9], it is shown that standard feedforward neural networks can be used to solve the N-bit parity problem with ⌈(N+1)/2⌉ hidden layer neurons, where ⌈·⌉ stands for rounding towards +∞. Using the same network structure, a solution is proposed in [8] for the N-bit parity problem. The connection weights between the inputs and hidden layer neurons can be obtained trivially, while the connection weights between the hidden layer neurons and the output layer neuron can be obtained by solving a system of linear equations.

The implementation of the N-bit parity problem using a different network structure is proposed in [5]. The implementation leads to a neural network with an output y = u − 2η, where

η = Σ_{j=1}^{[N/2]} 1 / (1 + e^{−40(u − 2j + 0.15)}),  u = Σ_{i=1}^{N} x_i,

and [·] indicates truncation to the nearest integer. The connection weights from the inputs x_i to u and from u to the hidden and output layer neurons are all equal to 1. It is also noted in [5] that the node u can be replaced by direct connections from the input nodes to the hidden and output nodes.
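
The construction of [5] can be checked numerically. The sketch below implements y = u − 2η as described, rounding the nearly integer output (function names are ours):

```python
import math

def eta(u, N):
    # Sum of steep sigmoids; term j switches from ~0 to ~1 as u passes 2j.
    return sum(1.0 / (1.0 + math.exp(-40.0 * (u - 2*j + 0.15)))
               for j in range(1, N // 2 + 1))

def parity_net(x):
    # y = u - 2*eta(u, N): for integer u this is approximately u mod 2.
    N = len(x)
    u = sum(x)
    return round(u - 2.0 * eta(u, N))
```

On integer input sums, η counts the thresholds 2j that u has reached, i.e. η ≈ ⌊u/2⌋, so y ≈ u mod 2.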

It is shown in [10] that, using a standard feedforward neural network, the N-bit parity problem can be solved with just two hidden layer neurons. The activation function used in both hidden units is

f(u) = (1/N) (u − cos(πu)/(απ)),

where the choice of α > 1 ensures that the function is monotonically increasing. One of the hidden units has a constant bias of −1 while the other has zero bias. The output unit is a linear threshold unit with a constant bias of −1/N. All connection weights are equal to 1 except the weight from the first hidden unit to the output unit, which is −1.
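
Assuming this reconstruction of f, the two-hidden-unit network of [10] can be verified directly; the wiring below follows the description in the text (names are ours):

```python
import math

def f(u, N, alpha=2.0):
    # Monotone activation: f(u) = (u - cos(pi*u)/(alpha*pi)) / N, alpha > 1.
    return (u - math.cos(math.pi * u) / (alpha * math.pi)) / N

def parity_two_hidden(x, alpha=2.0):
    N = len(x)
    u = sum(x)
    h1 = f(u - 1.0, N, alpha)  # hidden unit with bias -1, output weight -1
    h2 = f(u, N, alpha)        # hidden unit with zero bias, output weight 1
    return 1 if (-h1 + h2 - 1.0 / N) > 0 else 0
```

Since cos(π(u−1)) = −cos(πu), the pre-threshold value simplifies to −2cos(πu)/(απN), which is positive exactly when the integer input sum u is odd.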

When direct connections between the input layer and the output layer are introduced, it is shown in [1] that the problem can be solved with a structure that requires only one hidden layer neuron. The activation function used in the hidden layer neuron is the continuously differentiable “step” function

f(u) = ⌊u⌋ + sin^K(π(u − ⌊u⌋)/2),

where K > 1 and ⌊·⌋ stands for rounding towards −∞.
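
The reconstructed “step” function can be sanity-checked numerically; this sketch assumes the form f(u) = ⌊u⌋ + sin^K(π(u − ⌊u⌋)/2):

```python
import math

def smooth_step(u, K=3.0):
    # f(u) = floor(u) + sin(pi*(u - floor(u))/2) ** K, K > 1.
    fl = math.floor(u)
    return fl + math.sin(math.pi * (u - fl) / 2.0) ** K

# At integers f(m) = m; between integers it rises smoothly to the next
# integer, with zero slope at both ends of each unit interval (for K > 1).
```

The function thus interpolates the integer staircase while remaining continuously differentiable.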

In this paper, the N-bit parity problem is solved with a neural network that allows direct connections between the input layer and the output layer. The threshold transfer function is used in the hidden and output layer neurons. It is shown that a set of solutions for the 3-bit parity problem can be obtained using linear programming. One of these solutions is then generalized to obtain a solution for the N-bit parity problem using ⌊N/2⌋ hidden layer neurons. Furthermore, the present solution can be simplified by using a single hidden layer neuron with a “staircase” type activation function instead of ⌊N/2⌋ hidden layer neurons. The present paper makes contributions beyond those of [3] by constructing N-bit parity neural networks systematically using linear programming. We also note that there has been a recent follow-up (see, e.g., [4]) to our letter [3], published in 1999.

Solutions to the 3-bit parity problem

It can easily be shown that all mappings ψ : B^3 → B^1 are realizable using the architecture in Fig. 1. The 256 distinct mappings of this type can be generated by appropriately selecting the connection weights w_1, w_2, w_3, k, k_1, k_2, k_3 and bias values b_1 and b_2. The output of the neural network model shown in Fig. 1 is given by

y = f(x_1, x_2, x_3) = φ[w_1 x_1 + w_2 x_2 + w_3 x_3 + b_1 + k φ(k_1 x_1 + k_2 x_2 + k_3 x_3 + b_2)],

where the activation function φ is given by

φ(u) = 1 if u > 0, and φ(u) = 0 if u ⩽ 0.
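
As an illustrative stand-in for the linear-programming search, a brute-force scan over a small weight grid already recovers valid parameter sets for this architecture. Since parity is a symmetric function, equal input weights suffice; the grid below is our assumption, not the paper's formulation:

```python
import itertools

def phi(u):
    # Threshold activation: 1 if u > 0, else 0.
    return 1 if u > 0 else 0

def parity3(x):
    return sum(x) % 2

def net(x, w, b1, k, v, b2):
    # y = phi(w*u + b1 + k*phi(v*u + b2)), with u = x1 + x2 + x3 and
    # symmetric input weights w (to hidden-skipping path) and v (to hidden).
    u = sum(x)
    return phi(w * u + b1 + k * phi(v * u + b2))

# Scan a small grid of weights and biases for exact 3-bit parity solutions.
solutions = []
grid = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
for w, b1, k, v, b2 in itertools.product(grid, repeat=5):
    if all(net(x, w, b1, k, v, b2) == parity3(x)
           for x in itertools.product((0, 1), repeat=3)):
        solutions.append((w, b1, k, v, b2))
```

One of the recovered solutions, (w, b1, k, v, b2) = (1, −0.5, −2, 1, −1.5), has the hidden unit fire once the input sum reaches 2 and subtract 2 from the direct path, leaving exactly the parity of the sum.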

In the present section, we will show how to use linear

Solutions to the N-bit parity problem

The general solution to the N-bit parity problem can be obtained from the solution for the 3-bit parity problem and the resulting structure of Fig. 1. We note that when the input x_3 and all of its corresponding connections are removed, we obtain a well-known solution to the XOR problem using only one hidden neuron and direct connections between the inputs and the output. The results of column 1 in Table 2 indicate that the 3-bit parity problem (N=3) is realized with the same
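
One weight assignment consistent with the ⌊N/2⌋-hidden-neuron structure described here (illustrative, not necessarily the paper's exact solution) can be written and tested as:

```python
def phi(u):
    # Threshold activation: 1 if u > 0, else 0.
    return 1 if u > 0 else 0

def parity_n(x):
    # Hidden unit j fires once the input sum u reaches 2j; the direct
    # connections carry u to the output, and each active hidden unit
    # subtracts 2, leaving u mod 2 before the output threshold.
    N = len(x)
    u = sum(x)
    hidden = [phi(u - 2*j + 1) for j in range(1, N // 2 + 1)]
    return phi(u - 2 * sum(hidden) - 0.5)
```

For an integer sum u ≤ N, the hidden layer outputs total ⌊u/2⌋, so the output unit sees u − 2⌊u/2⌋ − 0.5, which is positive exactly when u is odd.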

Conclusion

We have shown that when direct connections are allowed between the input neurons and the output neuron, the N-bit parity mapping problem can be solved using neural networks requiring ⌊N/2⌋ hidden layer neurons. It is shown that through the choice of a “staircase” type activation function, the ⌊N/2⌋ hidden layer neurons can be further combined into a single hidden layer neuron. Our solution to the N-bit parity problem is generalized from a solution obtained for the 3-bit parity problem using
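
The collapse of the ⌊N/2⌋ threshold units into one “staircase” unit can be sketched as follows. On integer input sums the threshold units add up to ⌊u/2⌋, so a single hidden unit with that staircase activation suffices (the specific biases here are our assumptions):

```python
import math

def phi(u):
    # Threshold activation: 1 if u > 0, else 0.
    return 1 if u > 0 else 0

def parity_single_hidden(x):
    u = sum(x)
    s = math.floor(u / 2)  # staircase activation of the lone hidden unit
    # Direct connections carry u; the hidden unit subtracts 2*floor(u/2).
    return phi(u - 2 * s - 0.5)
```

The single staircase unit replaces the whole hidden layer while realizing the same mapping.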


Derong Liu received the Ph.D. degree in electrical engineering from the University of Notre Dame, Notre Dame, Indiana, in 1994; the M.S. degree in electrical engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 1987; and the B.S. degree in mechanical engineering from the East China Institute of Technology (now Nanjing University of Science and Technology), Nanjing, China, in 1982. From 1982 to 1984, he worked at China North Industries Corporation, Jilin, China. From 1987 to 1990, he was an instructor at the Graduate School of the Chinese Academy of Sciences, Beijing, China. From 1993 to 1995, he was with General Motors Research and Development Center, Warren, Michigan. From 1995 to 1999, he was Assistant Professor in the Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, New Jersey. Currently, he is Assistant Professor in the Department of Electrical and Computer Engineering, University of Illinois, Chicago, Illinois. He is coauthor of the book Dynamical Systems with Saturation Nonlinearities: Analysis and Design with A.N. Michel, New York: Springer-Verlag, 1994.

Dr. Liu was a member of the Program Committees of 11th and 14th IEEE International Symposium on Intelligent Control (1996 and 1999), 2000 International Conference on Artificial Intelligence, and 39th IEEE Conference on Decision and Control (2000); he was the Local Arrangements Chair for the 6th IEEE Conference on Control Applications (1997); he was a member of the Conference Editorial Board of the IEEE Control Systems Society (1995–2000); and he served as an Associate Editor for IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications (1997–1999). Dr. Liu is a member of the Program Committee of the 16th IEEE International Symposium on Intelligent Control (2001); he is a member of the Program Committee of the IFIP/IEEE International Conference on Management of Multimedia Networks and Services (2001); he is a member of the Program Committee of the 40th IEEE Conference on Decision and Control (2001); and he serves as an Associate Editor for IEEE Transactions on Signal Processing.

Dr. Liu was recipient of Michael J. Birck Fellowship from the University of Notre Dame (1990); he was recipient of Harvey N. Davis Distinguished Teaching Award from Stevens Institute of Technology (1997); and he was recipient of Faculty Early Career Development (CAREER) award from the National Science Foundation (1999). He is a member of Eta Kappa Nu.

Myron E. Hohil received the B.E. degree in 1991, the M.E. degree in 1994 and the Ph.D. degree in 1998 from Stevens Institute of Technology, Hoboken, NJ, all in Electrical Engineering. From 1996 to 1997 he served as a faculty member of the Electrical and Computer Engineering Department at Stevens Institute of Technology. He is currently a principal consultant to the U.S. Army Armament Research, Development and Engineering Center, Tank-automotive and Armaments Command (TACOM-ARDEC). A member of the IEEE, his research interests include neural networks, array signal processing, sensor fusion and multiple target tracking.

Dr. Stanley H. Smith received the B.A.E., M.A.E., Ph.D. (Engineering Science), and M.E. (honorary) degrees from New York University, Cornell University, New York University and Stevens Institute of Technology, respectively. In 1966, he joined the faculty of the Stevens Institute of Technology in Hoboken, N.J., where he is currently a Professor (Emeritus) of Electrical and Computer Engineering. From 1992 to 1996, he was the Head of the Department of Electrical Engineering and Computer Science.

Dr. Smith has served on technical evaluation and program committees and expert panels for SDIO (now BMDO), BTI, DARPA and Department of the Army programs. The areas that he was involved with were sensors/seekers, electronics, computer architectures, algorithm development and communications. He is a member of the Board of Advisers of the Northeast Region of the Federal Laboratory Consortium (FLC) for Technology Transfer, encompassing all U.S. Government laboratories in New Jersey, New York and New England. He was also a member of the National Advisory Group of the FLC.

His present research interests are concerned with multimedia, interactive video and wireless communications, sensors/seekers, multispectral image processing and pattern recognition (autonomous target recognition), application specific computer architectures (ASCA) for high-speed VLSI and VHSIC realizations including neural networks and embedded computer design. Dr. Smith is the author of more than ninety technical reports and papers. He was a contributing author in books on digital signal processing and on product defects for the legal profession. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), the IEEE Computer Society, the IEEE Communications Society and the Society for Photographic and Instrumentation Engineers (SPIE).

1. D. Liu's work was supported by the National Science Foundation under Grant ECS-9732785.
