Low power, less occupying area, and improved speed of a 4-bit router/rerouter circuit for low-density parity-check (LDPC) decoders

Chinnaiyan Senthilpari; Rosalind Deena; Lee Lini

doi:10.12688/f1000research.73404.2

Research Article

Revised

[version 2; peer review: 2 approved]

Chinnaiyan Senthilpari ¹, Rosalind Deena¹, Lee Lini¹

PUBLISHED 14 Nov 2022

¹ Faculty of Engineering, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia

Chinnaiyan Senthilpari
Roles: Methodology, Software, Validation, Writing – Original Draft Preparation, Writing – Review & Editing

Rosalind Deena
Roles: Conceptualization, Data Curation, Software, Writing – Original Draft Preparation, Writing – Review & Editing

Lee Lini
Roles: Formal Analysis, Investigation, Project Administration, Resources, Supervision, Writing – Review & Editing

This article is included in the Research Synergy Foundation gateway.

Abstract

Background: Low-density parity-check (LDPC) codes are more error-resistant than other forward error-correcting codes. Existing circuits have high power dissipation, low speed, and a large occupied area. This work aimed to propose a circuit with a better design and performance, even in the presence of noise in the channel.
Methods: In this research, the multiplexer and demultiplexer were designed using pass transistor logic. The target parameters were low power dissipation, improved throughput, and smaller delay with a minimum area. Among the essential connecting circuits in a decoder architecture are the multiplexer (MUX) and the demultiplexer (DEMUX). The design of the MUX and DEMUX contributes significantly to the performance of the decoder. The aim of this paper was to design a 4 × 1 MUX to route the data bits received from the bit update blocks to the parallel adder circuits and a 1 × 4 DEMUX to receive the input bits from the parallel adder and distribute the output to the bit update blocks in a layered architecture LDPC decoder. The design uses pass transistor logic and reduces the number of transistors used. The proposed circuit was designed using the Mentor Graphics CAD tool for 180 nm technology.
Results: The parameters of power dissipation, area, and delay were considered crucial parameters for a low power decoder. The circuits were simulated using computer-aided design (CAD) tools, and the results depicted a significantly low power dissipation of 7.06 nW and 5.16 nW for the multiplexer and demultiplexer, respectively. The delay was found to be 100.5 ns (MUX) and 80 ns (DEMUX).
Conclusion: This decoder’s potential use may be in low-power communication circuits such as handheld devices and Internet of Things (IoT) circuits.

Keywords

LDPC decoder, multiplexer, demultiplexer, pass transistor logic

Corresponding author: Chinnaiyan Senthilpari

Competing interests: No competing interests were disclosed.

Grant information: This work was supported by the Multimedia University, Grant Number MEC 21166.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2022 Senthilpari C et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Senthilpari C, Deena R and Lini L. Low power, less occupying area, and improved speed of a 4-bit router/rerouter circuit for low-density parity-check (LDPC) decoders [version 2; peer review: 2 approved]. F1000Research 2022, 11:7 (https://doi.org/10.12688/f1000research.73404.2) First published: 05 Jan 2022, 11:7 (https://doi.org/10.12688/f1000research.73404.1) Latest published: 14 Nov 2022, 11:7 (https://doi.org/10.12688/f1000research.73404.2)

Revised Amendments from Version 1

The manuscript has been revised according to the reviewer's comments. In this version, sentences in the Methods part of the abstract have been rephrased, and the area values of the DEMUX and MUX circuits have been added. In the design methods, some sentences have been rephrased, and Table 1 and Table 2 have been removed. The equations have been modified, and equation numbers have been added. The necessary text has been added to the design methods and to the results and discussion. The corrections requested by the reviewer have been made. Table 2 now shows the percentage of improvement in area and power dissipation. Finally, small corrections have been made throughout the paper. These changes improve the quality of the paper.

Introduction

Low-density parity-check (LDPC) codes are considered more error resistant when compared to other forward error-correcting codes. These error-based circuits have been proved by their performance in the presence of noise in the channel.¹ Hence, LDPC decoders have been used more actively for communication applications. Different approaches may be used in the design of an LDPC decoder. One such structure is the layered approach, consisting of a layered design, memory unit, computational block, full adders, parity check unit, bit update unit, and router/reverse router circuits.² The decoding process begins with data being received into the decoder through the bit update block. The bit update block receives data, arranges them into vectors according to the system requirements, and stores them. These data are routed to the parallel adder through the routing circuit and the data bus. The parallel adder now computes the memory block stored in the previous iteration and the new vector. The output of the computation is checked for errors using the parity checker.³ The result goes through another computation process to generate the original vector stored in the bit update unit for the next iteration. Also, new values after the parity check are stored in the memory block.

Routers are integral to this architecture, sending data bits through its different layers. Routers are multiplexer or demultiplexer circuits that select appropriate data to be sent or distribute the received data bits to other units. Multiplexers (MUX) and demultiplexers (DEMUX) form the basic units of data paths. They are used in applications like processor buses in CPUs, network switches, and digital signal processing stages involving resource sharing and graphic controllers. In large-scale systems, multiplexers help reduce the number of integrated circuits used in some designs. In this research, the multiplexer and demultiplexer are designed using pass transistor logic.⁴ According to existing work on multiplexer, demultiplexer, and LDPC encoder circuits, a higher number of transistors lengthens the critical path and results in higher power dissipation.⁵

The proposed method reduced the number of transistors in the design and arranged the transistors regularly, thereby reducing the critical path. The target was low power dissipation, improved throughput, and smaller delay with a minimum area. Low power design is essential when this circuit is used along with many other components for communication purposes. Pass Transistor Logic (PTL) can reduce the number of transistors by eliminating redundant transistors. Here the transistors act as switches to pass different logic levels between nodes of a circuit. This paper’s main objective was to design and develop routers and bit update blocks for the LDPC decoder. A proper design of the router, reverse router, and LDPC circuit reduces the critical path and power dissipation and increases speed. This paper reviews the related work in designing multiplexers and demultiplexers and describes the design methodology used in the proposed circuits. The results obtained from the simulation are analyzed, and conclusions are then drawn regarding the proposed circuits.

Literature review

Unlike the main building blocks, such as adders and parity checkers, routers form a crucial support system for the decoder. The routers, which mainly comprise multiplexers and demultiplexers, arrange data bits according to the system configuration and pass the information through the appropriate layers. Multiplexers are controlled by binary signals.² An analogue MUX/DEMUX has been designed using ternary inverters to control the circuits, together with CMOS transmission gates.⁶–⁸ The design was reported to work well with ternary inverters. Following the idea of switching activities suggested by Anitha and Javachitra,⁹ adiabatic logic reduces the power by returning the stored energy to the supply, and this was used for a 16:1 multiplexer and a 1:16 demultiplexer. The results indicated less power dissipation than conventional CMOS circuits. An 11 Gb/s CMOS demultiplexer using redundant multi-valued logic (RMVL) was proposed by Ahn and Kim (2006).¹⁰ The circuit received serial binary data and converted it to parallel redundant multi-valued data. The converted data were then reconverted to parallel binary data. This makes it possible to achieve higher operating speeds than conventional binary logic. The implemented DEMUX consisted of eight integrators and was designed with a 0.35 μm standard CMOS process. The DEMUX achieved a maximum data rate of 11 Gb/s and an average power consumption of 69.43 mW. This circuit was expected to operate faster than 11 Gb/s in a deep-submicron process with a high operating frequency. A demultiplexer has also been designed with 36 transistors using 90 nm CMOS technology.⁷ An auto-generation technique and semi-custom layout design were integrated. The semi-customized demultiplexer layout improved power consumption and area.

Methods

The router circuit in a decoder is a bank of MUX and DEMUX that forward the appropriate estimate terms from memory to the corresponding bit update circuit. The proposed MUX, DEMUX, bit update circuit, and proposed LDPC circuits logic simulations are executed mainly to validate the circuit’s functionality. The designed circuit had the required logic behaviour. In the layout, the memory cell’s charging and discharging were validated by the aspect ratio factor and expressed with current scaling methods. The proposed circuits were validated by reliable, optimum data of the designed parameters. Modern communication systems demand high reliability and optimum data rate, which makes the standards for future communication technology move towards methods of error correction that enable high throughput decoding with optimum performance based on the Shannon capacity.

Multiplexer (MUX)

The multiplexer is a combinational logic circuit that selects an appropriate analogue or digital signal from several input signals and forwards it to a single output line.¹¹ A multiplexer has several input lines and a single output line. The selection of the appropriate input is based on control lines called select lines. Figure 1 depicts a basic multiplexer with four inputs, I₀, I₁, I₂, I₃, and a single output line (Z). Multiplexers can be designed for 2ⁿ inputs. In this design, we used a 4 × 1 MUX because it is simpler to cascade these circuits for many inputs, and the decoder was also for 4-bit data. There are two select lines, S₀ and S₁, which are the circuit’s control lines. The MUX is 4 × 1, representing four inputs and one output. The select lines determine which input line is selected according to their binary conditions, ‘HIGH’ (1) or ‘LOW’ (0). In general, a multiplexer has 2ⁿ data input lines and n select inputs, so the number of select lines fixes the number of data inputs.

Figure 1. Proposed circuit of the 4 × 1 multiplexer.

The output Z is obtained from the Boolean expansion.

(1)

Z = I_{0} \bar{S_{1}} \bar{S_{0}} + I_{1} \bar{S_{1}} S_{0} + I_{2} S_{1} \bar{S_{0}} + I_{3} S_{1} S_{0} .

Equation (1) was expanded using the associative and commutative laws to obtain an appropriate and optimized circuit equation for implementing the multiplexer.¹¹ Depending on the combination of the select line inputs, exactly one input line is connected to the output Z. Adding more control address lines (n) allows the multiplexer to switch between 2ⁿ inputs. Still, each control line configuration connects only one input to the output. In our proposed circuit, the multiplexer is optimized using pass transistor logic.
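As an illustration of equation (1), the following minimal Python sketch evaluates the Boolean expression for every combination of inputs and confirms that the output always equals the input indexed by the select lines. It models only the logic function, not the pass transistor circuit; the function name `mux4` is hypothetical.

```python
from itertools import product

def mux4(i0, i1, i2, i3, s1, s0):
    """Logic-level 4 x 1 multiplexer following equation (1); arguments are 0 or 1."""
    n1, n0 = 1 - s1, 1 - s0  # complemented select lines
    return (i0 & n1 & n0) | (i1 & n1 & s0) | (i2 & s1 & n0) | (i3 & s1 & s0)

# Exhaustive check: Z equals the input whose index is the binary number S1 S0.
for i0, i1, i2, i3, s1, s0 in product((0, 1), repeat=6):
    assert mux4(i0, i1, i2, i3, s1, s0) == (i0, i1, i2, i3)[2 * s1 + s0]

# With S1 = 1 and S0 = 0 the multiplexer forwards I2.
z = mux4(0, 0, 1, 0, s1=1, s0=0)
```

Each select combination enables exactly one product term, which is why only one input reaches Z at a time.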

A 4 × 1 MUX was designed, as shown in Figure 1, and the input to the multiplexer in this circuit was from a bit update block (BUB), part of the LDPC decoder structure. The inputs were from the 4-bit update units used in the decoder circuit designed for this research. The multiplexer aimed to receive the updated data bits from the bit update unit and rearrange the vectors according to the circuit’s requirements.¹² The multiplexer circuit was designed using pass transistor logic. The MUX comprised NMOS and PMOS circuits for the inverters and only NMOS circuits for the remaining circuit. The inverter complemented the select input signals S₀(S_A) and S₁(S_B). The multiplexer was configured to have series-connected switches so that, based on the input combination of S₀ and S₁, one of the inputs was selected to pass the input to the output. The multiplexer passed a signal when the controlling voltage was logic low.

The circuit used NMOS transistors because electron mobility is higher than hole mobility, which gives better performance. The inputs I₀, I₁, I₂, and I₃, fed from the 4-bit update circuits, carried the bit update unit’s computation values. The input connected to the output line Z was chosen from I₀, I₁, I₂, and I₃ according to the select inputs S₁ and S₀. As an example, assume the select inputs had the combination S₀ = 0 and S₁ = 1. The S₀ input was fed to an inverter circuit formed by the pass transistors, which passed the value ‘0’ to the circuit, and S₁, with a logic ‘1’, was given to the other inverter circuit. In each inverter circuit, the NMOS connected the ground and the output, while the PMOS connected the supply V_DD and the output.¹³ The transistors then did what they are best suited for, that is, the NMOS passed a logic ‘0’, and the PMOS passed a logic ‘1’. Each inverter acted like a 2 × 1 MUX whose inputs are logic 0 and logic 1. The input variable acted as the control signal and determined which input was sent to the output. Hence, combining both inverters at the input selected the signal sent to the output, which would be either I₀, I₁, I₂, or I₃. In this example, I₂ was fed to the output, so Z = I₂.

The multiplexer design can be enlarged to have many more inputs using the basic multiplexer circuits. A 16 × 1 MUX can be designed using 2 × 1, 4 × 1, and 8 × 1 MUX. Following the basic MUX circuit design, four 4 × 1 multiplexers are used in the first stage, so 16 inputs are available. Inputs I₀ to I₃ (for bits zero to three) go to the first multiplexer, I₄ to I₇ (for bits four to seven) to the second, and so on, so that the last multiplexer has inputs I₁₂ to I₁₅ (for bits 12 to 15). The select inputs of every multiplexer are combined in parallel into two main selection lines that connect all four multiplexers.¹⁴,¹⁵ The four outputs of these multiplexers are then fed as the inputs to another 4 × 1 multiplexer. The output from this multiplexer becomes the main output of the circuit.
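The cascading scheme can be sketched at the logic level as follows. This is a minimal sketch, assuming that the two shared select lines are S₁ and S₀ and that the final 4 × 1 stage has its own two select lines, here given the hypothetical names S₃ and S₂; the text does not name them.

```python
def mux4(inputs, s1, s0):
    """Behavioural 4 x 1 multiplexer: forwards inputs[S1 S0]."""
    return inputs[2 * s1 + s0]

def mux16(inputs, s3, s2, s1, s0):
    """16 x 1 multiplexer built from five 4 x 1 multiplexers.

    s1, s0: select lines shared by the four first-stage multiplexers.
    s3, s2: select lines of the final stage (names assumed for illustration).
    """
    if len(inputs) != 16:
        raise ValueError("expected 16 input bits")
    # First stage: multiplexer k receives inputs I(4k) .. I(4k+3).
    first_stage = [mux4(inputs[4 * k:4 * k + 4], s1, s0) for k in range(4)]
    # Second stage: one more 4 x 1 multiplexer picks among the four outputs.
    return mux4(first_stage, s3, s2)

# Example: only I13 is high; select word S3 S2 S1 S0 = 1101 (decimal 13).
bits = [0] * 16
bits[13] = 1
out = mux16(bits, 1, 1, 0, 1)
```

With this arrangement the selected input index is 8·S₃ + 4·S₂ + 2·S₁ + S₀, so the two stages together behave like a single multiplexer with four select lines.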

Demultiplexer (DEMUX)

A demultiplexer is a combinational circuit that routes a single input line to one of multiple digital output lines. A demultiplexer with 2ⁿ outputs has ‘n’ select lines that determine which output line is connected to the input.¹³,¹⁴ In simple terms, it is a data distributor. The demultiplexer here is a 1 × 4 unit, implying a single input line Y and four output lines, D₀, D₁, D₂, and D₃. There are two select lines, S₀ and S₁. The select lines decide to which output line the input line Y should be connected. They are controlled by binary combinations of 0 and 1, so S₀ and S₁ can take on the values 00, 01, 10, and 11. These are the four possible combinations for two select signals and hence there are four possible output lines. The connection of input Y to the output lines D₀, D₁, D₂, and D₃ for each combination is obtained from the equation,

(2)

Y = \bar{S_{1}} \bar{S_{0}} D_{0} + \bar{S_{1}} S_{0} D_{1} + S_{1} \bar{S_{0}} D_{2} + S_{1} S_{0} D_{3} .

By adding more address line inputs, it is possible to switch more outputs, giving 1-to-2ⁿ data line outputs.¹⁶ The proposed demultiplexer was a 1 × 4 demultiplexer constructed using pass transistor logic, as shown in Figure 2. In the figure, two inverter circuits form the input point for the DEMUX. The inverters were constructed with opposite polarity Metal Oxide Semiconductor Field Effect Transistors (MOSFETs) with their gates connected to form the input voltage V, shown as S_A and S_B. The drain terminals of both MOSFETs were connected to form a common output.¹⁷ Because of this complementary connection, only one MOSFET conducts when the input voltage is low or high.

Figure 2. Proposed circuit of the 1 × 4 demultiplexers.

The gate-source voltage V_GS of the NMOS transistor is equal to V_in, that is:

(3)

V_{GS} = V_{in} \quad \text{(for NMOS)}

and the source-gate voltage V_SG of the PMOS transistor is:

(4)

V_{SG} = V_{DD} - V_{in} \quad \text{(for PMOS)}

where V_DD is the supply voltage and the input voltage can take values from 0 to V_DD. When S_A = V_in = V_DD, the PMOS transistor is cut off while the NMOS conducts, so current flows to the ground terminal and the output voltage is ‘0’. The ‘0’ volts are now applied to one of the inputs of transistor T5, which is in series with T6.

If input S_B had an input value of ‘0’ volts, the NMOS transistor of the inverter was cut off while the PMOS conducted to give a path to the power supply, and the output now had a value of V_DD. The second input to transistor T5 was ‘0’. The transistor had inputs 0 and 1 and gave an output ‘0’, indicating that line D_A had been selected to distribute the input from the parity check circuit of the layered decoder circuit. Similarly, the other lines D_B, D_C, and D_D were selected to receive that input for the other input combinations of S_A and S_B. The input fed at line D (Y in the truth table) was distributed to any of the four outputs represented by D₀, D₁, D₂, and D₃. The distribution was based on the select inputs S₀ (S_A) and S₁ (S_B). In Figure 2, the select lines are connected to two inverters at the first stage of the DEMUX. Each inverter created the complemented terms given in equation (2). The inverter was driven by the value of S₀, and if it was a ‘0’, the output was a ‘1’; the same applied to the S₁ input. The following transistors drove the input to the outputs based on the bit pattern of S₁S₀.
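The distribution of Y according to the bit pattern S₁S₀ can be sketched at the logic level as below. This is the standard textbook 1 × 4 demultiplexer behaviour, in which each output is Y gated by one of the select-line product terms appearing in equation (2); it is not a model of the transistor circuit in Figure 2, and the function name `demux4` is hypothetical.

```python
def demux4(y, s1, s0):
    """Logic-level 1 x 4 demultiplexer; returns the tuple (D0, D1, D2, D3)."""
    n1, n0 = 1 - s1, 1 - s0  # outputs of the two select-line inverters
    d0 = y & n1 & n0         # selected when S1 S0 = 00
    d1 = y & n1 & s0         # selected when S1 S0 = 01
    d2 = y & s1 & n0         # selected when S1 S0 = 10
    d3 = y & s1 & s0         # selected when S1 S0 = 11
    return d0, d1, d2, d3

# With S1 S0 = 10 the input bit appears only on D2.
outputs = demux4(1, s1=1, s0=0)
```

Every unselected output stays at 0, so at most one output line carries the input bit for any select pattern.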

Bit update circuit

The bit update circuit is integral to many circuits in which temporary storage and periodic updates of stored data are required. These circuits have memories that store some predetermined subset of codeword bits, though only one at a time. The circuit uses basic logic gates: an EXOR gate, a latch, a multiplexer, and an inverter. It works as a loop: the received input data bits are fed into the multiplexer and compared with the previously stored data from the latch. The EXOR gate identifies new data and passes it to the MUX, whose select inputs ensure that the new data is stored in the latch. This recently stored data is then sent to the next section of a larger application circuit.

In the proposed circuit, the data input was from the DEMUX circuit, transmitting data bits received. The bit update circuit ensured that new data received was always updated and stored and then distributed through the reverse router to the parallel adder blocks in the decoder through the data bus. The bit update circuit usually works in tandem with two memories, one as an accumulator for a new data set and the other supplies the last iteration’s data.¹⁸ These two memories act in an alternating manner. A multiplexer worked like a cross switch to facilitate their alternating operation.

The proposed bit update circuit was designed using the pass transistor logic to reduce the number of transistors. The delay needed to be reduced in the circuit; hence, the technology used was adequate. The circuit shown in Figure 3 comprises a 2 × 1 multiplexer circuit with a latch. The latch acted as the temporary storage or memory for the data bits. The data bit stored in the latch was given to an EXOR gate connected to an AND gate delay circuit. This was to create a delay so that the bits reached the multiplexer within the clock pulse. The EXOR input was also fed to MUX as one of the select inputs.

Figure 3. Proposed bit update circuit.

The proposed LDPC decoder circuit

This section describes the proposed decoder architecture, which follows the layered components decoding technique. The top-level architecture is shown in Figure 4. Layered components decoding generally processes the rows of a parity check matrix layer by layer.¹⁶ Each layer is processed sequentially, and the processing of each layer depends on data processed in the immediately preceding layer. Decoders using the layered technique therefore have an inbuilt latency for passing data between layers. For example, before a layer of the parity check matrix can be processed, the data processed by the previous layer must first be received. These data may not yet be available, because they are still being processed in the previous layer or are still on the data bus and have yet to reach their destination. Such latency affects the performance of the decoder and needs to be addressed in layered decoding methods. In the proposed circuit, improvements were made to the layered components decoding approach. The proposed method used a plurality of parallel computation blocks coupled to the memory, multiple parity check blocks connected to the computation blocks, and multiple bit update blocks connected to the parity check blocks. Each bit update block had a memory. In this system, the received codeword was split, and at least one column/row was grouped and processed.

Figure 4. Proposed LDPC decoder architecture.

A low-density parity-check code suitable for efficient hardware implementation was designed with a belief propagation decoder circuit. Codes were arranged according to a sample H matrix whose rows and columns represented the parity check matrix. The decoder circuit had a parity check value that estimated memory, which could be arranged in groups and was logically connected to different data lengths and depths. A parallel adder generated approximate values fed to the parity check circuit. The new bitstream generated new values of estimates. These values generated were then stored back in the memory and fed to the bit update circuit. The bit update circuit then updated the new value for the subsequent input data received. Here, layered components decoding was performed by applying the decoding algorithm to each successive layer. Since no particular algorithm was developed, we used a standard to show how the improved decoder works. Applying a decoding algorithm for a particular layer included the use of calculations done in previous layers. The decoding was done using parallelized decoding hardware, and hence its performance may be better than the conventional approach.

The memory block was a local RAM for storing the estimates derived within the iteration. These estimates were stored in the memory to save chip area. The storage memory had one output coupled to one input of the parallel adder. This was connected to the negative input of the parallel adder to provide a subtrahend for the subtraction that took place in the parallel adder. The output of the parallel adder was applied to the parity check update circuitry. This block updated the estimates obtained from memory for each of the parity check nodes. The output of the parity check circuit was applied back to the memory to store the updated values. It was also applied to the router circuit to update the Log-Likelihood Ratio (LLR) of the input nodes. The router circuitry was a collection of multiplexers and demultiplexers that forwarded the appropriate estimate terms from memory to the corresponding bit update circuit. The bit update circuits were accumulators in which the current LLR values of the input nodes were maintained from one iteration to the next.

LDPC operation

The LDPC code was defined by a parity check matrix H of m × j values such that H·c = 0 over the Galois field, where c was the transmitted word vector. The Galois field is a finite field, that is, a field that contains a finite number of elements. For each row m of the parity check matrix H, the parity check could be written as H₁c₁ + H₂c₂ + … + H_jc_j = 0 over the Galois field. Hence, a parity check equation, the EXOR function of the bits of ‘c’, could be written for each row by considering the columns in which that row of the H matrix has a ‘1’.
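The row-wise parity check can be illustrated with a short Python sketch over the binary Galois field GF(2), where addition is EXOR. The small matrix `H` and the word `c` below are hypothetical values chosen only for illustration; they are not the code used in the proposed decoder.

```python
def syndrome(H, c):
    """Return H.c over GF(2): one parity bit per row of H."""
    return [sum(h & bit for h, bit in zip(row, c)) % 2 for row in H]

# Hypothetical 3 x 6 parity check matrix and a word that satisfies every row.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
c = [1, 0, 1, 1, 1, 0]
valid = syndrome(H, c)        # all zeros for a valid codeword

# Flipping one bit violates every row that has a '1' in that column.
corrupted = c.copy()
corrupted[2] ^= 1
failed = syndrome(H, corrupted)
```

A non-zero entry in the result marks a parity check row that is not satisfied, which is the information the decoder uses to detect errors.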

Referring to Figure 4, soft data received was routed into the decoder system through the data bus. The received data was first routed into the bit update block. Here the data was initialized into its components of a vector. Let us assume the vector for the received data as ‘L’. We defined a set where all the bit columns for a row ‘m’ and the bits in the H matrix have a one in row ‘m’. This makes the checksum for a row over a finite field. The LDPC decoder helps detect errors in the received data when checked for every row in the matrix. When data is received, the values may not be precisely binary values of 1 or 0 but some fractional values represented by several bits. Hence a probability of whether the bits are 1 or 0 can be represented using the LLR given by:

(5)

L(r_{j}) = \log \left[ \frac{P(c_{j} = 0)}{P(c_{j} = 1)} \right] .

where r_j is the input bit value.
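A small numerical sketch of equation (5) is given below. It assumes the natural logarithm (the equation does not state the base) and takes the probability that the bit is 0 as its only input; the function name `llr` is hypothetical.

```python
import math

def llr(p_zero):
    """Log-likelihood ratio log[P(c=0) / P(c=1)] for one bit.

    p_zero is P(c = 0), strictly between 0 and 1; natural log is assumed.
    """
    if not 0.0 < p_zero < 1.0:
        raise ValueError("p_zero must lie strictly between 0 and 1")
    return math.log(p_zero / (1.0 - p_zero))

likely_zero = llr(0.9)   # positive: the bit is more likely 0
likely_one = llr(0.1)    # negative: the bit is more likely 1
undecided = llr(0.5)     # zero: both values equally likely
```

The sign of the LLR gives the hard decision and its magnitude gives the confidence in that decision.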

As every input bit arrives, its estimated value is written based on the LLR. Initially, an estimate was assumed for the LLR based on the type of channel being used.

A vector ‘R_mj’ was stored in the SRAM. These were estimates stored in the SRAM after every iteration or cycle of the decoding process and updated in the next iteration. The memory stores a few corresponding rows of values of R_mj, representing vector R values for m rows and j columns from a parity check matrix. For every row, the vector L used for the checksum was written as:

(6)

L_{(q)} = L_{qj} - R_{mj} .

The vector was then stored in the BUB. The data were fed from the BUB into the reverse router block by data buses, where they were rearranged as required by the system. The values of the vector L were given as input to the parallel adder (PA). The other input to the parallel adder came from the memory, with the stored data in the form of the components of vector R. The parallel adder performed the approximation operations and the subtraction of vector R from L. The result of this subtraction, the output ‘sum’, was given as input to the parity check circuit and to the second set of parallel adders (PA2). A checksum, a sequence of numbers and letters used to detect errors introduced during data transmission, was carried out in the parity check block. The results of this operation were then fed to the second set of parallel adder blocks and to the memory block for storage. In PA2, the results of the earlier subtraction and the checksum were added to regenerate the vector L. The new values of L were then sent to the router block to be rearranged into the components of vector L. These values were given to the BUB to be stored for the next iteration.
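The data flow for one row (layer) can be sketched in Python as follows: subtract the stored R values from L as in equation (6), compute new check values, then add them back to regenerate L. The check-node rule used here is the widely used min-sum approximation, which is an assumption made purely for illustration; the text does not specify the rule implemented in the parity check block, and the names `process_layer`, `L`, `R_row`, and `cols` are hypothetical.

```python
def process_layer(L, R_row, cols):
    """Update the LLR vector L in place for one parity check row.

    L:     list of LLR values, one per codeword bit
    R_row: dict mapping column index -> stored estimate R for this row
    cols:  column indices where this row of H has a '1' (at least two)
    Returns the new R values to be written back to memory.
    """
    # First parallel adder: subtract the previous estimates (equation 6).
    Lq = {j: L[j] - R_row[j] for j in cols}

    # Parity check update, here the min-sum rule (assumed, not from the text).
    R_new = {}
    for j in cols:
        others = [Lq[k] for k in cols if k != j]
        sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
        R_new[j] = sign * min(abs(v) for v in others)

    # Second parallel adder (PA2): add back to regenerate L.
    for j in cols:
        L[j] = Lq[j] + R_new[j]
    return R_new

# First iteration for a row touching bits 0, 1 and 3; stored R starts at zero.
L = [2.0, -0.5, 1.5, 3.0]
R = process_layer(L, {0: 0.0, 1: 0.0, 3: 0.0}, cols=[0, 1, 3])
```

In this example the weakly negative LLR of bit 1 is pulled positive by its two confident neighbours, while bit 2, which is not part of the row, is left unchanged.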

Results and discussion

The DEMUX and MUX circuits developed here were tested as part of the decoder circuit. The results obtained after simulations at different voltage values and using 180 nm technology are highlighted below, with improvements.

Demultiplexer (DEMUX)

The 1 × 4 demultiplexer for the LDPC decoder was constructed to have one input D and four outputs, D₀, D₁, D₂, and D₃. The demultiplexer had two select inputs, S₀ and S₁. The select inputs formed the decision-maker that connected the input to a selected output. The selection was based on the four possible combinations of the select inputs, namely, S₀ = 0 and S₁ = 0, S₀ = 0 and S₁ = 1, S₀ = 1 and S₁ = 0, and finally, S₀ = 1 and S₁ = 1, representing the binary forms 00, 01, 10, and 11. The proposed demultiplexer was simulated to check its characteristics using Mentor Graphics PADS VX.2.7 x86, a CAD tool for 180 nm technology (open-access software that can perform an equivalent function is DSCH version 2.7 for schematic design and MICROWIND version 2.0 for layout analysis).

The string of data bits was given as input D, with the select inputs S₀ and S₁ varied over the four possible combinations. It should also be noted that the voltage levels after the rises and falls in Figures 5(a) to 5(c) are not exactly zero or one. There was some signal distortion, but the voltage levels were still clear enough to be read as 0 or 1. Varying the voltage between 1 V, 1.3 V, and 1.5 V did not significantly affect the output waveforms, with only a slight variation in the peak voltage values.

Figure 5(a). Voltage vs time simulation at 1V for 180 nm.

Figure 5(b). Voltage vs time simulation at 1.3 V for 180 nm.

Figure 5(c). Voltage vs time simulation at 1.5 V for 180 nm.

The waveforms shown in Figures 5(a) to 5(c) represent the distribution of bits received from the adder circuit (refer to Figure 4). The choice of output is based on S₀ (S_A) and S₁ (S_B). The waveforms of D₀, D₁, D₂, and D₃ also show the effect of the gates’ switching characteristics and the peak voltage drops, which are partly due to the capacitive effect at the input nodes. As the output voltage increases in time, the biasing voltages decrease. A decreasing gate-source voltage reduces the charge density and limits the output voltage, which does not reach V_DD.

The output voltage was seen to take time to reach its final value. This was due to the parasitic capacitance, that is, the gate channel capacitance between the gate-source and gate-drain terminals. Any switching action in the device has to charge or discharge this parasitic capacitance. A sudden change of voltage from zero to a high value creates a capacitive effect that can be modelled as an RC circuit. Because of the resistance, the device consumes more power to drive the circuit, and the effect appears as a delay in the device’s output voltage. This parasitic delay is present even when the gate drives zero load, and it grows linearly with the number of inputs. The effect was seen in the waveforms for the demultiplexer, which displayed a slowly increasing ramp voltage. According to the simulation result, the demultiplexer area is 10.5 × 25.555 μm².

Multiplexer (MUX)

The reverse router had a multiplexer to transmit data bits from the bit update circuit to the parallel adder through the data bus. The multiplexer’s characteristic was choosing a particular input to be connected to the output. The selection of the input was based on the two select signals. In Figure 3, the schematic of the multiplexer is shown. The multiplexer had four inputs, I_A, I_B, I_C, and I_D, and a single output, Z. The select inputs were S_A and S_B. Hence the multiplexer was a 4 × 1 MUX. Since there are only two select lines, the possible input lines were four, and the possible combination was S_BS_A = 00, S_BS_A = 01, S_BS_A = 10, and S_BS_A = 11. The schematic in Figure 3 is simulated using the test bench. The 180 nm technology was used for the simulation, and voltage values of 1 V, 1.3 V, 1.5 V, and 2.5 V. Here the threshold voltage loss restricts the output voltage to the range [0V, V_DD – V_Tn].

The proposed multiplexer circuit was simulated for voltage versus time in 180 nm technology for input voltages of 1 V, 1.3 V, and 1.5 V, and the output waveforms are shown in Figures 6(a) to 6(c), respectively. Figures 6(a) to 6(c) show the selected input being passed to the output. Even though the output waveform represented the correct selected input, it was slow to reach the maximum voltage, and for some inputs it did not return fully to zero. The delay was caused by the inverter, and the threshold voltage loss restricted the maximum voltage. Charging the output to a logic 1 voltage was very slow compared to the transition to a logic 0. The parasitic capacitance increased the charging time from low to high since charge was diverted from the output node. The charging of the output capacitance was time-dependent: it began linearly, as (t/2τ_n), and then levelled out.
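The shape of this charging curve can be sketched numerically. The article only gives the initial linear term (t/2τ_n); the full expression used below, V_out(t) = (V_DD – V_Tn) · (t/2τ_n) / (1 + t/2τ_n), is a commonly used first-order model for passing a logic 1 through an n-channel pass transistor and is an assumption here. The values of V_Tn and τ_n are hypothetical and are not taken from the simulations.

```python
def pass_nfet_vout(t, v_dd, v_tn, tau_n):
    """First-order logic-1 transfer through an nFET pass transistor.

    Assumed model (not stated in full in the article):
        V_out(t) = (V_DD - V_Tn) * x / (1 + x),  x = t / (2 * tau_n)
    For small t this is approximately linear in t / (2 * tau_n);
    for large t it levels out at V_DD - V_Tn.
    """
    v_max = v_dd - v_tn
    x = t / (2.0 * tau_n)
    return v_max * x / (1.0 + x)


if __name__ == "__main__":
    V_TN = 0.4      # hypothetical threshold voltage in volts
    TAU_N = 10e-9   # hypothetical time constant in seconds
    for v_dd in (1.0, 1.3, 1.5):
        samples = [pass_nfet_vout(n * TAU_N, v_dd, V_TN, TAU_N) for n in (1, 2, 10, 100)]
        print(f"V_DD = {v_dd} V:", ", ".join(f"{v:.3f} V" for v in samples))
```

Under this model the output approaches V_DD – V_Tn only slowly, which is consistent with the slow low-to-high transitions seen in Figures 6(a) to 6(c).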

Figure 6(a). Voltage vs time simulation at 1 V for 180 nm.

Figure 6(b). Voltage vs time simulation at 1.3 V for 180 nm.

Figure 6(c). Voltage vs time simulation at 1.5 V for 180 nm.

Since V_out(t) increases with time, the device bias voltages V_GS = V_DD – V_out(t) = V_DS decrease with time. A decreasing value of V_GS reduces the channel charge density, while a smaller V_DS reduces the drain-source electric field. This indicates that passing a logic 1 voltage through the n-channel transistor is difficult. The spikes seen in the output were caused by capacitive coupling of the input to the output through the gate-drain capacitance. As the input suddenly increased from 0 V to V_DD, the capacitance could not change its voltage instantly. Hence, it retained some charge, which is seen as voltage spikes. The proposed multiplexer circuit area is 9.9 × 32.155 μm².

The multiplexer and demultiplexer circuits were simulated using the SilTerra CEDEC Pyxis project of the Mentor Graphics CAD tool PADS VX.2.7 x86. The simulations used input voltages of 1 V, 1.3 V, and 1.5 V in 180 nm technology, and the results are tabulated in Table 1. The results showed low power dissipation, in the nanowatt range. This is because pass transistor logic reduced the number of transistors used. The reduced number of transistors (12 and 14) led to lower power dissipation and reduced layout area. The delay is only 80 ns and 130 ns for DEMUX and MUX, respectively.

Table 1. Results of the simulation for the multiplexer (MUX) and demultiplexer (DEMUX).

Circuit	Input voltage (V)	Power dissipation (nW)	Delay (ns)	Area (μm²)	No. of transistors
Demultiplexer	1	1.567	80.00	268.32	12
	1.3	7.01	80.00	268.32	12
	1.5	5.14	80.00	268.32	12
Multiplexer	1	1.537	139.60	318.33	14
	1.3	3.660	139.91	318.33	14
	1.5	7.067	100.50	318.33	14

Table 2 shows a comparison of the proposed circuit with various published research. It can be seen that the proposed circuit dissipates less power than each of the listed designs. The proposed multiplexer circuit has a power dissipation of 7.067 nW, whereas Bousseaud and Negra⁷ reported a value of 5 mW. The approach of Bousseaud and Negra⁷ used a transmission gate, while pass transistor logic is used in the proposed circuit. Pass transistor logic (PTL) provides an advantage in the design of circuits by eliminating redundant transistors. Reducing the number of transistors lowers the power dissipation, since each transistor occupies some area and dissipates power. For the DEMUX circuit, the power dissipation reported by Saseendran and Mehra⁶ was 142 μW; for the proposed circuit, it was 5.14 nW. The proposed circuit also operates at a lower supply voltage of 1.5 V. There was also a large difference in the number of transistors used in the designs.

Table 2. Comparison of the results with other MUX/DEMUX circuits.

Reference	Circuit	Design	Technology	Area (μm²)	Area improvement (%)	Power	Power improvement (%)	Supply voltage	Technique	No. of transistors
Saseendran and Mehra⁶	DEMUX	1 × 4	90 nm	7482	96.41	142 μW	96.38	2 V	Adiabatic	36
Bousseaud and Negra⁷	MUX	4 × 1	65 nm	-	-	5 mW	99.99	1.2 V	Transmission gate	-
Pandey and Akashe⁸	MUX	2 × 1	90 nm	65.54	-63.68	1.38 μW	99.62	2 V	CPTL	6
Anitha and Jayachitra⁹	MUX	16 × 1	90 nm	-	-	5.23 mW	99.99	2 V	Transmission gate	162
Anitha and Jayachitra⁹	DEMUX	1 × 16	90 nm	-	-	5.23 mW	99.99	5 V	Transmission gate	162
Ahn and Kim¹⁰	DEMUX	1 × 8	0.35 μm	51200	99.47	69.45 mW	99.99	3.3 V	RMVL	-
Proposed circuit	MUX	4 × 1	180 nm	318.33	-	7.067 nW	-	1.5 V	PTL	14
Proposed circuit	DEMUX	1 × 4	180 nm	268.27	-	5.14 nW	-	1.5 V	PTL	12
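
The article does not state how the improvement percentages in Table 2 were calculated. A minimal sketch, assuming the usual relative-reduction formula (reference − proposed) / reference × 100, is given below; the function name is hypothetical. With the area of Saseendran and Mehra⁶ (7482 μm²) and the proposed DEMUX area (268.27 μm²), this formula gives the 96.41 listed in the table.

```python
def percent_improvement(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent.

    Assumed formula (not stated in the article):
        (reference - proposed) / reference * 100
    A negative result means the proposed value is larger than the reference.
    """
    return (reference - proposed) / reference * 100.0


if __name__ == "__main__":
    # Area of the 1 x 4 DEMUX of Saseendran and Mehra versus the proposed DEMUX (both in um^2).
    print(f"{percent_improvement(7482.0, 268.27):.2f} %")
```

Both quantities must be expressed in the same unit before the formula is applied.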

Bit update circuit

The bit update circuit receives new data, arranges them into vectors, and routes them to the multiplexer as input to the parallel adder. In each iteration of the decoder circuit, the bit update circuit restored new data values after rewriting the data received from the router circuit with data from the transmitter received through the data bus. The bit update circuit was simulated for voltage versus time in 180 nm technology for input voltages of 1 V, 1.3 V, 1.5 V, and 2.5 V, and the output waveforms are shown in Figures 7(a) to 7(d), respectively.

Figure 7(a). Voltage vs time simulation at 1 V for 180 nm.

Figure 7(b). Voltage vs time simulation at 1.3 V for 180 nm.

Figure 7(c). Voltage vs time simulation at 1.5 V for 180 nm.

Figure 7(d). Voltage vs time simulation at 2.5 V for 180 nm.

Figures 7(a) to 7(d) depict the output obtained for the bit update circuit. The arrival of the clock signal at the input nodes caused clock skew due to the capacitive coupling effect. In the output at 1.5 V, it can be observed that the waveform has glitches. Glitches are temporary changes in the value of the output. They were caused by the skew in the input signals to the gate of the transistor. Gate sizing and path balancing techniques can reduce glitches. Propagation of glitches can also be reduced by using fewer inverters, which tend to amplify and propagate glitches. At a higher voltage of 2.5 V, the output showed a smooth and expected waveform. The area of the bit update circuit is 46.42 × 14.62 μm².

The complete proposed LDPC decoder circuit

The whole LDPC circuit was designed according to Figure 4. The components added to the test bench were V_DD, the ground terminal (GND), the DC source, and the pulse source. The DC source supplied the input voltage of 1 V, 1.3 V, or 1.5 V. For the pulse source, we set the delay (1 ns), the initial value (0 V), the period (50 ns), the pulse value, the rise time, the fall time, and the pulse width. The bit pattern to be run through the decoder was also specified. The proposed decoder circuits were simulated for the voltage versus time effect on the output voltage at input voltages of 1 V, 1.3 V, and 1.5 V in 180 nm technology, as shown in Figures 8(a) to 8(c).
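The pulse stimulus described above can be sketched as a piecewise-linear function of time. The delay (1 ns), initial value (0 V), and period (50 ns) are taken from the text; the rise time, fall time, and pulse width below are hypothetical placeholders, since the article does not give them, and the function is not the test bench used by the authors.

```python
def pulse(t_ns, v_low=0.0, v_high=1.5, delay=1.0, rise=0.1, fall=0.1,
          width=24.9, period=50.0):
    """Periodic trapezoidal pulse, with all times in nanoseconds.

    From the article: delay = 1 ns, initial value = 0 V, period = 50 ns.
    Hypothetical (not given in the article): rise, fall and width.
    """
    if t_ns < delay:
        return v_low                      # before the first edge
    tau = (t_ns - delay) % period         # position inside the current period
    if tau < rise:
        return v_low + (v_high - v_low) * tau / rise
    if tau < rise + width:
        return v_high
    if tau < rise + width + fall:
        return v_high - (v_high - v_low) * (tau - rise - width) / fall
    return v_low


if __name__ == "__main__":
    for t in (0.5, 10.0, 40.0, 60.0):
        print(f"t = {t:5.1f} ns -> {pulse(t):.2f} V")
```

Changing v_high to 1.0 or 1.3 gives the stimulus for the other two supply settings under the same assumptions.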

Figure 8(a). Voltage vs. time simulation at 1 V for 180 nm.

Figure 8(b). Voltage vs. time simulation at 1.3 V for 180 nm.

Figure 8(c). Voltage vs. time simulation at 1.5 V for 180 nm.

The carry inputs to the second set of parallel adders are also shown as check 0 to check 3. The output was measured at various points of the circuit, that is, the output of the memory unit (Vo1), the output of the adder (Vo2), the output of the parity check (Vo3), the output of the router (Vo4), and the final output at the reverse router (VoF). It was observed that at the initial check points, the output voltage did not suffer from any signal loss. As the circuit became larger, the losses in the different circuits accumulated. At the final output (VoF), glitches were observed at regular intervals. This happens in pass transistors that are off, where the source and drain were initially high and then pulled low. The output of the router circuit shows that the waveform reached the peak voltage but did not reach the zero line. This indicates a residual voltage that kept the output from reaching zero. In practice, the drain current of a MOS transistor does not fall to zero when the gate voltage goes below the threshold voltage.

These are the most recently updated values from the parity check unit block (PUCB), and they are used for the next iteration.²³ The flow of data into the circuit, with the received data entering at the bit update circuit, was tested with bits of data taken from the rows of a standard H matrix. Every stage of the movement of the bits through each layer, namely the bit update circuit, the reverse router, the data bus to the first parallel adder, the parity check block, the second set of parallel adders, and the memory holding the stored data, has been simulated and its outputs observed.

Tabulated results of the proposed LDPC decoder

The results of individual layers and the entire decoder are tabulated in Table 3. Various input voltages were given to observe the effect on the decoder. The decoder circuit designed achieved low power dissipation and a reasonable delay improvement.

Table 3. Results of the proposed LDPC decoder circuits (2020).

Circuit	Input voltage (V)	Power dissipation (W)	Delay (ns)	Area (μm²)	No. of transistors
SRAM	1	415.286 n	160.00	9.5 × 11.69	8
	1.3	964.608 n	160.00	9.5 × 11.69	8
	1.5	1.424 n	160.00	9.5 × 11.69	8
Parallel adder	1	61.06 n	60.22	7.985 × 20.765	16
	1.3	295.337 n	120.00	7.985 × 20.765	16
	1.5	762.28 n	119.98	7.985 × 20.765	16
Parity checker	1	4.476 n	60.025	63.200 × 18	46
	1.3	8.774 n	60.009	63.200 × 18	46
	1.5	14.811 n	60.008	63.200 × 18	46
Demultiplexer	1	1.567 n	80.00	10.5 × 25.555	12
	1.3	7.01 n	80.00	10.5 × 25.555	12
	1.5	5.14 n	80.00	10.5 × 25.555	12
Bit update circuit	1	635.006 μ	57.190	46.42 × 14.62	36
	1.3	515.219 μ	79.999	46.42 × 14.62	36
	1.5	3.877 n	79.999	46.42 × 14.62	36
	2.5	2.12 n	40.002	46.42 × 14.62	36
Multiplexer	1	1.537 n	139.60	9.9 × 32.155	14
	1.3	3.660 n	139.91	9.9 × 32.155	14
	1.5	7.067 n	100.50	9.9 × 32.155	14
LDPC decoder circuit	1	3.818 n	80.073	147.505 × 122.78 = 18110.6639	982
	1.3	12.950 n	80.021	147.505 × 122.78 = 18110.6639	982
	1.5	68.514 n	80.023	147.505 × 122.78 = 18110.6639	982
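
The areas in Table 3 are given as layout dimensions. As a small worked check, the sketch below multiplies the listed width and height of each block to obtain the area in μm². The dictionary and function names are hypothetical, and treating each pair of numbers as width × height of a rectangular layout is an assumption.

```python
# Layout dimensions in micrometres, copied from Table 3.
LAYOUTS_UM = {
    "SRAM": (9.5, 11.69),
    "Parallel adder": (7.985, 20.765),
    "Parity checker": (63.200, 18.0),
    "Demultiplexer": (10.5, 25.555),
    "Bit update circuit": (46.42, 14.62),
    "Multiplexer": (9.9, 32.155),
    "LDPC decoder circuit": (147.505, 122.78),
}


def area_um2(width_um, height_um):
    """Area of a rectangular layout in square micrometres."""
    return width_um * height_um


if __name__ == "__main__":
    for name, (w, h) in LAYOUTS_UM.items():
        print(f"{name:22s} {area_um2(w, h):12.4f} um^2")
```

For the complete decoder, the product of 147.505 and 122.78 agrees with the 18110.6639 μm² stated in the table.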

Comparison of results

In Table 4, the obtained results for the LDPC decoder are compared and analyzed with other published work.

Table 4. Comparison of proposed circuit results with published work.

Reference	Technology	Delay (ns)	Power dissipation	Throughput	Frequency	Area
Sipos et al.¹⁹	90 nm	-	437.2 mW	7.92 Gbps	-	-
Senthilpari et al.²⁰	65 nm	43.2	-	3.9 Gbps	208 M	6.62 mm²
Lee et al.²¹	90 nm	-	517.7 mW	1956.5 Mbps	400 M	5.529 mm²
Bhargava et al.²²	180 nm	44.53	-	-	22.5 M	-
Proposed circuit (1 V)	180 nm	80.07	3.818 nW	12.184 Mbps	-	181 μm²
Proposed circuit (1.3 V)	180 nm	80.02	12.95 nW	12.496 Mbps	-	181 μm²
Proposed circuit (1.5 V)	180 nm	80.02	68.51 nW	12.496 Mbps	-	181 μm²

The proposed circuit performed better in power dissipation than the work done by Lee et al.²¹ The power dissipated by the proposed circuit is in nanowatts, while the values reported in the references¹⁹,²¹ are in milliwatts. This may be because the proposed circuit was designed using pass transistor logic, which reduced the number of transistors. CMOS circuits dissipate power mainly during switching.

Hence, reducing the switching activity reduced the power dissipation. Other studies¹⁹⁻²¹ achieved 7.92 Gbps, 3.9 Gbps, and 1956.5 Mbps for 90 nm, 65 nm, and 90 nm technology, respectively. The proposed circuit, simulated for 180 nm, obtained throughputs of 12.184 Mbps, 12.496 Mbps, and 12.496 Mbps for input voltages of 1 V, 1.3 V, and 1.5 V. The throughput was lower than in those studies because the circuit was designed to function well at lower voltages; throughput is traded off against low supply voltage, lower power dissipation, and smaller area. We used 1 V, 1.3 V, and 1.5 V, and performance at 1.5 V was much better in power dissipation and throughput. At lower voltages, the noise margin becomes critical. The area of the proposed circuit is in square micrometres, which is also reduced compared to Bhargava et al.²¹ (Table 4).
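The article does not state how the throughput values were derived from the delay. One simple way to relate the two, offered here only as an assumption, is to take one bit per delay period, so that throughput is the reciprocal of the delay. The function name and the bits_per_cycle parameter are hypothetical. With the decoder delays from Table 3, this estimate comes out near the reported values.

```python
def throughput_mbps(delay_ns, bits_per_cycle=1):
    """Throughput in Mbit/s, assuming `bits_per_cycle` bits per delay period.

    1 bit per nanosecond equals 1000 Mbit/s, hence the factor 1e3.
    This one-bit-per-delay relation is an assumption, not the authors' stated method.
    """
    return bits_per_cycle / delay_ns * 1e3


if __name__ == "__main__":
    # Decoder delays in ns for 1 V, 1.3 V and 1.5 V, from Table 3.
    for v_in, delay in ((1.0, 80.073), (1.3, 80.021), (1.5, 80.023)):
        print(f"{v_in} V: {throughput_mbps(delay):.3f} Mbps")
```

Under this assumption, a higher throughput at the same supply voltage would require a shorter critical-path delay or more bits processed per cycle.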

Conclusion

The proposed router circuit, which includes the multiplexer and demultiplexer circuits, was designed using pass transistor logic. The proposed circuit gave better power dissipation and throughput performance than existing circuits due to the reduced critical path. The circuits were simulated using the Mentor Graphics CAD tool for the design and layout. The results show significant improvement in power dissipation, area, and delay. For the multiplexer, the improvement in power was 99%, but there was a difference in the technology used. The number of transistors used in the proposed circuit was also significantly reduced, which was the intention of this work. The delay obtained was 80 ns, and the areas of 10.5 × 25.55 μm² for the demultiplexer and 9.9 × 32.15 μm² for the multiplexer were considered small. The silicon area utilization of the designed circuit reduced delay and power dissipation, making the router circuitry appear well suited for use in the decoder circuit. The multiplexer and demultiplexer circuits can be used in an LDPC decoder that uses the layered approach. The multiplexer received input from the bit update block based on the state of the select inputs. The select inputs chose which data bits needed to be routed to the parallel adder block for the next iteration.

Data availability

All data underlying the results are available as part of the article and no additional source data are required.

Acknowledgements

We would like to thank the anonymous referees for their valuable suggestions. We thank our beloved Multimedia University for supporting this work.

References

1. Hassan AES, Dessouky M, Elazm AA, et al.: Evaluation of Complexity Versus Performance for Turbo Code and LDPC Under Different Code Rates. The Fourth International Conference on Advances in Satellite and Space Communications. Egypt: 2012; 98–102.
2. Jahangir MZ, Mounika J: Design and simulation of an innovative CMOS ternary 3 to 1 multiplexer and the design of ternary half adder using ternary 3 to 1 multiplexer. Microelectron. J. 2019; 90: 82–87.
3. Ueng Y-L, Yang B-J, Chung-Jay Yang C-J, et al.: An efficient Multi-standard LDPC decoder design using Hardware-friendly Shuffled Decoding. IEEE Transactions on Circuits and Systems-1. 2013; 60(3): 743–756.
4. Zhao Z, Yang K, Zheng H, et al.: Design, Simulation, and Implementation of a CMOS analog decoder (480,240) for Low-Density Parity-Check Codes. IEEE Access Journal. 2017; 5: 17381–17391.
5. Khan SA: Digital Design of Signal Processing Systems-A Practical Approach. Wiley publications; 2011.
6. Saseendran TK, Mehra R: Area and Power Efficient CMOS De-multiplexer Layout on 90nm Technology. International Journal of Scientific Research Engineering & Technology (IJSRET). 2015; 102–105. ISSN 2278–0882. EATHD-2015 Conference Proceeding.
7. Bousseaud P, Negra R: Low power high-speed 10 Gb/s 4∶1 multiplexer for sliding-IF digital-centric transmitter in 65nm CMOS. German Microwave Conference (GeMiC), Bochum. 2016; pp. 349–352.
8. Pandey V, Akashe S: Design Techniques for Self-Voltage Controllable Circuit on 2:1 multiplexer using 45nm Technology. Int. J. Comput. Appl. 2014; 89: 11–18.
9. Anitha K, Jayachitra R: Design, and Analysis of CMOS and Adiabatic logic using 1:16 Multiplexer and 16:1 Demultiplexer. International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering. 2015; 4(12): 8–17.
10. Ahn SH, Kim JB: Design of an 11Gb/s CMOS Demultiplexer Using Redundant Multi-Valued Logic. 8th International Conference on Solid-State and Integrated Circuit Technology Proceedings, Shanghai, China. 1580–1582.
11. Kapilachander T, Shanavas HI, Venkataraman V: Technical Study on low power VLSI methods. International Journal Information Engineering and Electronic Business. 2012; 4: 60–70.
12. Kavitha S, Hashim FH, Md. Mamun IR, et al.: Design of a low-power and high throughput error detection and correction circuit using the 4t ex-or method. Journal of Engineering Science and Technology (JESTEC). 2017; 12(8): 2010–2027.
13. McKay DJC, Neal RM: Near Shannon Limit performance of low-density parity-check codes. Electron. Lett. 1996; 32: 1645.
14. Pant D, Pandey A, Mehra R: CMOS Design of 2:1 Multiplexer Using Complementary Pass Transistor Logic. National Conference ADVICE-12, NITTTR, Chandigarh. 2012; pp. 1–7.
15. Pari SC: Low Power Arithmetic Circuit Design for Multimedia Applications. Sharma M, Gautam R, Khan M, editors. Design and Modeling of Low Power VLSI Systems. Hershey, PA: IGI Global; 2016; pp. 252–282.
16. Selvam RDK, Senthilpari C, Lini L: Design of a 16-bit adder for decoder application circuit. J. Eng. Sci. Technol. (JESTEC). 2019; 249–260.
17. Selvam RDK, Senthilpari C, Lini L: Improved Speed Low Power Low Voltage SRAM Design for LDPC Application circuits. Journal of Engineering Science and Technology (JESTEC). 2018; 822–837.
18. Kavitha S, Hashim FH, Kamal N: A New Approach of an Error Detecting and Correcting Circuit by Arithmetic Logic Blocks. Int. J. Electron. Telecommun. 2018; 65(2): 313–318.
19. Sipos E, Festila L, Oltean G: Towards reconfigurable circuits based on ternary controlled analog multiplexers/demultiplexers. Proceedings of the 12th International Conference KES Part III Heidelberg, Berlin. 2008; 351–359.
20. Senthilpari C, Velrajkumar P, Diwakar JSF: Design A Low Power and High Throughput Error Detection and Data Correction Architecture by Razor II Method. PalArch's J. Archaeol. Egypt/ Egyptol. 2020; 17(9): 4393–4410.
21. Lee X-R, Chen C-L, Chang H-C, et al.: A 7.92 Gb/s 437.2 mW Stochastic LDPC Decoder Chip for IEEE 802.15.3c Applications. IEEE Trans. Circuits Syst. 2015; 62(2): 507–516.
22. Bhargava L, Bose R, Balakrishnan M: Novel hardware implementation of LLR-based non-binary LDPC decoders. 2013 National Conference on Communications (NCC), 2013. 2017; pp. 1–5.
23. Ismail M, Ahmed I, Coon J: Low Power Decoding of LDPC Codes. International Scholarly Research Notices. 2013; 2013: 1–12.

Competing interests

No competing interests were disclosed.

Grant information

This work was supported by the Multimedia University, Grant Number MEC 21166.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Article Versions (2)

version 2

Revised

Published: 14 Nov 2022, 11:7

https://doi.org/10.12688/f1000research.73404.2

version 1

Published: 05 Jan 2022, 11:7

https://doi.org/10.12688/f1000research.73404.1

Copyright

© 2022 Senthilpari C et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

how to cite this article

Senthilpari C, Deena R and Lini L. Low power, less occupying area, and improved speed of a 4-bit router/rerouter circuit for low-density parity-check (LDPC) decoders [version 2; peer review: 2 approved]. F1000Research 2022, 11:7 (https://doi.org/10.12688/f1000research.73404.2)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Version 2

Reviewer Report 15 Nov 2022

Hamed Farbeh, Amirkabir University of Technology, Tehran, Iran

Approved

https://doi.org/10.5256/f1000research.140044.r155700

My concerns have been addressed in the revision.

Version 1

Reviewer Report 19 Jul 2022

Hamed Farbeh, Amirkabir University of Technology, Tehran, Iran

Not Approved

https://doi.org/10.5256/f1000research.77052.r142900

The manuscript proposes an efficient circuit for parity check decoder to reduce power consumption, delay, and area of the circuit. The authors tried to optimize the (de)mux circuitry for passing the data bits through the circuit. The evaluation on 180nm technology node show considerable improvement in targeted parameters. The manuscript is fluent and the contributions are clear. My major concerns are as follows:

The evaluations are performed on outdated 180nm feature size. As we are on less than 5nm technology, how scalable and valid are the observations and improvements of this work? What about the compatibility of the proposal to very smaller technology nodes? A discussion in these directions is necessary.
The authors reported the absolute value of some parameters, but it does not make any sense when not compared to the state-of-the-art or a reference value. It is recommended to report the comparative values (maybe the percentage of improvements can help).
By “EXOR” gate, are the authors referring to well-known “XOR” gate?
Table 4 is strange to me. The authors reported their evaluation on the proposed circuit and the competitors, but they are not evaluated based on the same technology node. When comparing several designs for a circuit, their efficiency is comparable only when evaluated in the same scenario. To my understanding, the authors just used the report of each paper in the table for other schemes and did not implement them in their own experimental platform. If so, the results cannot be technically sound. I strongly recommend the authors to evaluate all schemes in the same evaluation platform.
The literature review is weak and needs to be more comprehensive. In minimum, the state-of-the-art (DE)MUXs in Table 4 should be discussed. This helps to better highlight the distinction between this work and the existing ones.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? No source data required
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Reliability, Fault tolerance, Internet-of-Things, Memory technology, Hardware accelerators.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Reviewer Report 17 Feb 2022

Suthendran Kannan, Department of Information Technology, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India

Approved

https://doi.org/10.5256/f1000research.77052.r121437

I have gone through this manuscript titled “Low power, less occupying area, and improved speed of a 4-bit router/re-router circuit for low-density parity-check (LDPC) decoders’’ which is well written and formatted. Even though some points are against its quality as listed below:

Abstract is well written; according to the title, this manuscript occupies less area, which has to be illustrate in the result section of the abstract
The introduction is excellent, but if the author adds a text about sections it is well and good
Is it necessary to keep tables 1 and 2? If not, remove it
Why not give an equation number for the below text equation “Gate-Source voltage V_GS is equal to V_in, that is:” Page number 4?
Figure 5 (a)- 5(c); the output waveform got a skew problem, how to get it and explain in the relevant page?
The results and explanations are good, and sufficient materials included.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? No source data required
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Wireless Communication

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.



