Research Article

Physics-Informed Neural Networks without Loss Balancing: A Direct Term Scaling Approach for Nonlinear 1D Problems

[version 1; peer review: awaiting peer review]
PUBLISHED 14 Nov 2025

This article is included in the HEAL1000 gateway.

Abstract

Physics-Informed Neural Networks (PINNs) have gained significant attention for solving differential equations, yet their efficiency is often hindered by the need for intricate and computationally costly loss-balancing techniques to address residual term imbalance. This paper introduces a direct differential equation term scaling framework that removes the loss-balancing bottleneck entirely. By scaling each term in the governing equations using characteristic physical dimensions, the proposed method ensures numerical consistency across all contributions, eliminating the need for adaptive weighting during training. This not only simplifies the PINN formulation but also improves stability and convergence. The approach is validated on challenging nonlinear one-dimensional elasticity problems, demonstrating that high-accuracy solutions can be obtained with compact neural network architectures while reducing floating-point operations by at least two orders of magnitude. A reverse scaling step restores the solution to the original physical domain, preserving physical interpretability. The results demonstrate that direct term scaling transforms PINN training into an efficient and easily deployable process, paving the way for broader adoption in computational mechanics and other physics-driven domains.

Keywords

Compact High-Accuracy PINNs, Physics-Consistent Normalization, Loss-Balancing Elimination

1. Introduction

Current state of the art. Physics-Informed Neural Networks (PINNs)1 have emerged as a powerful and innovative approach to solving ordinary and partial differential equations (ODEs/PDEs). By integrating deep or shallow machine learning with physical principles, PINNs approximate solutions to these differential equations through optimization techniques. The process involves constructing a composite loss function that includes several components: (a) the residual of the ODE/PDE, which measures how well the solution satisfies the equation, (b) the initial conditions, which represent the state at the starting point, and (c) the boundary conditions, which describe the behavior at the edges of the domain. This method has shown great promise in various fields, such as fluid dynamics, material science, and inverse problems, because it can effectively handle complex, high-dimensional systems without relying on traditional numerical discretization methods. However, despite their potential, PINNs face a significant challenge: unbalanced loss terms. This issue can slow down convergence, reduce accuracy, and limit scalability.

The challenge of unbalanced loss terms arises from the fact that different components of the loss function frequently operate on significantly varying scales. For example, the magnitude of the residual may be substantially larger or smaller than that of the boundary condition term. This imbalance results in uneven gradients during the backpropagation process, which may lead to the optimization process favoring one term over the others. Consequently, the neural network may converge to a suboptimal solution or, in some cases, fail to converge entirely. To mitigate this issue, researchers have proposed a range of strategies, each exhibiting distinct advantages and disadvantages.

Recent advancements have explored regularization strategies and specialized network architectures aimed at enhancing the performance of PINNs. For instance, grouping regularization strategies alter the conventional loss function by implementing distinct scaling factors for each loss term, thereby ensuring that all terms are of similar magnitude and can be optimized concurrently.2 DN-PINNs3 have been designed to facilitate an even distribution of multiple back-propagated gradient components throughout the training process. By assessing the relative weights of initial or boundary condition losses in accordance with gradient norms, DN-PINNs dynamically adjust these weights to guarantee balanced training.

An extension of loss-term scaling involves adaptive weighting schemes, which adjust the weights of loss terms dynamically throughout the training process. For instance, Gaussian probabilistic models employ maximum likelihood estimation to update the weights of each loss term during each training epoch, thereby ensuring that the network concentrates on the most critical terms.4 Another notable method is the min-max algorithm, which identifies data points that present greater difficulty for training and mandates that the network prioritizes these challenging instances in subsequent iterations.5 The wbPINN method6 introduces an adaptive loss weighting strategy and a newly developed loss function that incorporates a correlation loss term and a penalty term to effectively address the interrelationships among the various loss terms.

Furthermore, weighting schemes based on gradient statistics evaluate the gradients of individual loss terms during backpropagation and make necessary adjustments to their weights, promoting balanced training7; this work has been further refined through the introduction of kurtosis-standard deviation-based weighting and combined mean and standard deviation-based schemes, both of which enhance the accuracy of solutions to partial differential equations. Improved adaptive weighting PINNs based on Gaussian likelihood estimation have been applied to solve nonlinear PDEs.8 Learning rate annealing algorithms also employ gradient statistics during training to balance the contributions of different loss terms, thus reducing the risk of training failure.9 Another innovative approach is the Stochastic Dimension Gradient Descent (SDGD) method,10 which decomposes the gradient of the residual into smaller components corresponding to various dimensions. The SDGD method then randomly samples subsets of these components, thereby ensuring efficient optimization for high-dimensional challenges. Gradient-enhanced PINNs (gPINNs) incorporate gradient information of the PDE residual into the loss function to improve accuracy, especially for problems with steep gradients.11 Residual-Quantile Adjustment (RQA) reassigns weights based on the distribution of residuals, ensuring a more balanced training process.12

Another line of inquiry examines optimization-driven methodologies aimed at balancing loss components. For instance, the augmented Lagrangian relaxation technique converts the constrained optimization problem into a series of max-min problems, enabling the network to adaptively equilibrate each loss term.13

Numerical treatment of the PDE and tweaking of the neural network architecture form another promising path. The normalized reduced-order physics-informed neural network (nr-PINN)14 converts the original PDE into a system of normalized lower-order equations. This technique employs scaling factors to mitigate gradient failures resulting from substantial PDE parameters or source functions and introduces a mechanism to automatically fulfill boundary conditions by redefining the outputs of the neural network. Integration of derivative information into the loss function has been further explored in the literature.15 This study constructs a loss function that includes both the differential equation and its derivative, enabling the network to automatically satisfy boundary conditions without explicit training at boundary points.

Challenges & Research gap. Despite their notable success, the selected method categories still encounter various challenges. For instance, PINNs based on Gaussian likelihood estimation struggle with solutions that exhibit sharp changes or discontinuities and gPINNs often require integration with other methods to achieve optimal performance. Moreover, these methodologies heavily depend on machine learning components while often overlooking the treatment of mathematical formulations. As a result, demonstrating their efficacy typically requires complex optimization processes and extensive hyperparameter tuning, which imposes significant computational demands.

In consideration of the challenges mentioned above, this study proposes a novel approach for addressing the issue of unbalanced loss terms in PINNs by regularizing the values of the differential equation terms prior to the construction of the loss function. Contrary to existing methodologies that concentrate on adjusting weights during the training phase or modifying network architectures, our approach involves preprocessing the PDE terms to ensure that they function on comparably scaled values. This strategy alleviates the burden on the machine learning component during the optimization process, thereby enhancing convergence, accuracy, and scalability while preserving the flexibility and robustness inherent in PINNs. By bridging the existing research gap regarding the treatment of unbalanced loss terms, our methodology offers an efficient framework for solving differential equations utilizing PINNs. Furthermore, our proposed method is straightforward to implement, as demonstrated through a step-by-step application to two distinct mechanical problems: an elastic rod and an Euler beam. In both cases, we follow the exact same procedural framework, highlighting the method’s consistency and ease of use. The only variation lies in the number of variables and functions that require normalization, reinforcing the generality and adaptability of our approach across different types of differential equations. It has to be noted that the benchmark problems refer to cases with variable material and geometrical properties, which cannot be solved using traditional finite element methods.

The remainder of this paper is organized as follows: Section 2 presents the theoretical foundations of neural networks and PINNs. As these are well-established methodologies, only a brief introduction is provided, with appropriate references for further details. The section also introduces a general approach for scaling terms in differential equations, with specific examples applied to fundamental elasticity problems. Additionally, it explores extensions to more complex cases where analytical solutions are unavailable. Section 3 provides a comprehensive presentation of numerical experiments and results, covering validation cases ranging from simple models with constant properties to those with varying and nonlinear characteristics. The methodology’s efficiency is evaluated through comparisons with case studies from the literature. Finally, Section 4 summarizes the key findings, discusses current limitations, and outlines potential directions for future research.

2. Theoretical background

2.1 Neural networks and PINNs

Neural networks are computational models inspired by the structure and function of biological neural systems, specifically designed to learn complex patterns and relationships from data.16 The fundamental unit of a neural network is the neuron or perceptron, which processes an input x to produce an output ŷ through the function ŷ=σ(w·x+b), where the parameters w and b are referred to as the weight and bias, respectively, learnt during training.17 The ‘activation function’ σ introduces nonlinearity and enables the network to model complex problems. Multiple neurons are clustered into groups known as layers, collectively forming a neural network (Figure 1). Each layer may have several inputs and outputs, which are connected through an extension of the fundamental neuron equation.

(1)
$\hat{y}_j = \sigma\left(\sum_{i=1}^{N_{in}} x_i\, w_{ij} + b_j\right), \qquad j \in [1, N_{out}]$
where $N_{in}$ is the number of inputs of each neuron and $N_{out}$ is the number of neurons of the layer; since each neuron produces a single output, the number of outputs of the layer is equal to the number of neurons, hence the use of $N_{out}$. The sequence of calculations from input, through the network, to the output is termed the forward pass.
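To make Eq. (1) concrete, the following minimal NumPy sketch (not part of the original formulation; names and values are illustrative) evaluates the forward pass of a single fully connected layer:

```python
import numpy as np

def dense_layer_forward(x, W, b, activation=np.tanh):
    """Forward pass of one fully connected layer, Eq. (1):
    y_j = sigma( sum_i x_i * w_ij + b_j ), one output per neuron."""
    return activation(x @ W + b)

# Illustrative example: 3 inputs feeding a layer of 2 neurons
rng = np.random.default_rng(0)
x = rng.random(3)            # inputs x_i
W = rng.random((3, 2))       # weights w_ij  (N_in x N_out)
b = rng.random(2)            # biases b_j
y = dense_layer_forward(x, W, b)   # outputs y_j, shape (2,)
```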


Figure 1. The general structure of a neural network.

According to the universal approximation theorem, a neural network has the capacity to approximate any function with arbitrary precision18; however, the required connectivity of the neurons, referred to as the network architecture, must be thoroughly examined. Generally, deep architectures, consisting of multiple hidden layers, are utilized to capture hierarchical features, whereas shallow networks, which consist of fewer layers, are deployed in scenarios where data or computational resources are limited.19 The training process entails the formulation of a ‘Loss function’ to compare the outputs generated by the neural network (ŷ) against an established ground truth (y), followed by the targeted adjustment of weights and biases.

In the context of PINNs, neural networks are extended to incorporate physical constraints directly into the training process. Unlike conventional neural networks that rely solely on data-driven learning, PINNs integrate information from governing equations and boundary conditions into the loss function, ensuring that the learned solutions adhere to underlying physical laws.1 The loss function in PINNs comprises three main components: (i) a data loss term that ensures consistency with available observations, (ii) a physics loss term that enforces compliance with differential equations, and (iii) a boundary/initial condition loss term that satisfies prescribed constraints.
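As a schematic illustration of this composite loss (a sketch only; the function and variable names are placeholders rather than the authors' implementation), the three components can be assembled as follows:

```python
import torch

def pinn_loss(model, x_data, y_data, x_col, residual_fn, x_bc, y_bc):
    """Composite PINN loss with the three components described above.
    `residual_fn(model, x)` is a placeholder that returns the ODE/PDE residual
    at the collocation points x_col via automatic differentiation."""
    mse = torch.nn.functional.mse_loss
    loss_data = mse(model(x_data), y_data)               # (i) data consistency
    loss_phys = (residual_fn(model, x_col) ** 2).mean()  # (ii) governing-equation residual
    loss_bc   = mse(model(x_bc), y_bc)                   # (iii) boundary/initial conditions
    return loss_data + loss_phys + loss_bc
```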

PINNs can be categorized into two main types: data-driven approaches, which infer hidden relationships from experimental data, and physics-driven approaches, which directly solve differential equations while enforcing physical consistency.20,21 It has been demonstrated that the integration of physical principles into neural network architectures improves their interpretability, generalizability, and applicability across a diverse array of scientific and engineering challenges.22 This study focuses on the latter category, specifically employing physics-driven PINNs to solve differential equations while addressing challenges associated with unbalanced loss terms.

2.2 Term scaling

An effective introduction to the scaling treatment of differential equations is presented herein, based on generalized scaling methods.23 The objective of scaling is to render both the dependent and independent variables dimensionless, while simultaneously positioning them within the unit range. Each variable is normalized using a characteristic quantity relevant to the specific problem. For instance, the spatial variable may be normalized in relation to the length of a structure. Subsequently, all terms and derivatives appearing in the differential equation are expressed in terms of their dimensionless equivalents. Any unestablished scaling coefficients are approximated by requiring that the corresponding terms of the equation remain proximate to unity. Upon resolving the equation in its normalized form, a reverse process is employed to retrieve the solution in the physical domain.

The proposed method, referred to as Scaled Equation-Enhanced Physics Informed Neural Network (SEE-PINN), for handling unbalanced loss terms in PINNs, is built directly on this scaling framework. By ensuring that all terms of the differential equation operate on comparable scales before constructing the loss function, our approach simplifies implementation while improving optimization efficiency. This is demonstrated through its application to two distinct mechanical problems: an elastic rod and an Euler beam. In both cases, the exact same procedural steps are followed, emphasizing the method’s consistency and ease of use. The only variation lies in the number of variables and functions requiring normalization, reinforcing its generality and adaptability across different types of differential equations. The following paragraphs provide the numerical formulation of equations typically encountered in the literature, incorporating this systematic scaling approach.

2.2.1 Elastic rod

The response of an elastic rod is described by the 1D Poisson equation:

(2)
$\dfrac{d}{dx}\left[E(x)\,A(x)\,\dfrac{du}{dx}\right] + p(x) = 0, \qquad x \in [0, L]$
where L is the length, E is the Young’s modulus, A is the cross-sectional area, p is the applied load and u corresponds to the pursued axial displacement. Because all quantities are presumed to vary arbitrarily with respect to x, general analytic solutions cannot be derived. Eq. (2) can be expanded as
(3)
$E_x A\, u_x + E A_x\, u_x + E A\, u_{xx} + p = 0$
where subscript x indicates differentiation.

To derive the normalized formulation of Eq. (3) each term is normalized using a characteristic quantity. Initially, the normalized spatial variable is defined as

(4)
$\bar{x} = \dfrac{x}{x_c} \;\Rightarrow\; x = x_c\,\bar{x} \;\Rightarrow\; \dfrac{d\bar{x}}{dx} = \dfrac{1}{x_c}$
where xc is the scaling coefficient. In the same sense, the Young’s modulus, the cross-sectional area, the axial displacement and the applied load can be expressed in their respective normalized forms:
(5)
$\bar{E} \equiv \bar{E}(\bar{x}) = \dfrac{E(x)}{e_c} \;\Rightarrow\; E(x) = e_c\,\bar{E}(\bar{x})$
(6)
$\bar{A} \equiv \bar{A}(\bar{x}) = \dfrac{A(x)}{a_c} \;\Rightarrow\; A(x) = a_c\,\bar{A}(\bar{x})$
(7)
$\bar{u} \equiv \bar{u}(\bar{x}) = \dfrac{u(x)}{u_c} \;\Rightarrow\; u(x) = u_c\,\bar{u}(\bar{x})$
(8)
$\bar{p} \equiv \bar{p}(\bar{x}) = \dfrac{p(x)}{p_c} \;\Rightarrow\; p(x) = p_c\,\bar{p}(\bar{x})$
where $e_c, a_c, u_c, p_c$ are scaling coefficients ensuring that the ranges of $\bar{E}, \bar{A}, \bar{u}$ and $\bar{p}$ are normalized to $[0,1]$. The following reasonable assumptions are made for Eqs. (4)-(8):
(9)
$x_c = L, \qquad e_c = \max E(x), \qquad a_c = \max A(x), \qquad p_c = \max |p(x)|$

For uc no assumption can be made at this point, since the value of u(x) remains unknown; consequently, it must be approximated through an alternative method.

The respective spatial derivatives appearing in Eq. (3) are expressed in normalized form using Eqs.(4)-(8):

(10)
$E_x(x) = \dfrac{dE(x)}{dx} = \dfrac{d}{dx}\left[e_c\,\bar{E}(\bar{x})\right] = e_c\,\dfrac{d\bar{E}(\bar{x})}{d\bar{x}}\,\dfrac{d\bar{x}}{dx} = \dfrac{e_c}{x_c}\,\bar{E}_{\bar{x}}(\bar{x})$
(11)
$A_x(x) = \dfrac{dA(x)}{dx} = \dfrac{d}{dx}\left[a_c\,\bar{A}(\bar{x})\right] = a_c\,\dfrac{d\bar{A}(\bar{x})}{d\bar{x}}\,\dfrac{d\bar{x}}{dx} = \dfrac{a_c}{x_c}\,\bar{A}_{\bar{x}}(\bar{x})$
(12)
$u_x(x) = \dfrac{du(x)}{dx} = \dfrac{d}{dx}\left[u_c\,\bar{u}(\bar{x})\right] = u_c\,\dfrac{d\bar{u}(\bar{x})}{d\bar{x}}\,\dfrac{d\bar{x}}{dx} = \dfrac{u_c}{x_c}\,\bar{u}_{\bar{x}}(\bar{x})$
(13)
$u_{xx}(x) = \dfrac{du_x(x)}{dx} = \dfrac{d}{dx}\left[\dfrac{u_c}{x_c}\,\bar{u}_{\bar{x}}(\bar{x})\right] = \dfrac{u_c}{x_c}\,\dfrac{d\bar{u}_{\bar{x}}(\bar{x})}{d\bar{x}}\,\dfrac{d\bar{x}}{dx} = \dfrac{u_c}{x_c^2}\,\bar{u}_{\bar{x}\bar{x}}(\bar{x})$

Then, Eq. (3) can be reformulated utilizing the normalized quantities:

(14)
$\dfrac{e_c}{x_c}\bar{E}_{\bar{x}} \cdot a_c\bar{A} \cdot \dfrac{u_c}{x_c}\bar{u}_{\bar{x}} + e_c\bar{E} \cdot \dfrac{a_c}{x_c}\bar{A}_{\bar{x}} \cdot \dfrac{u_c}{x_c}\bar{u}_{\bar{x}} + e_c\bar{E} \cdot a_c\bar{A} \cdot \dfrac{u_c}{x_c^2}\bar{u}_{\bar{x}\bar{x}} + p_c\,\bar{p} = 0$

After simplifying the equation:

(15)
$\bar{E}_{\bar{x}}\,\bar{A}\,\bar{u}_{\bar{x}} + \bar{E}\,\bar{A}_{\bar{x}}\,\bar{u}_{\bar{x}} + \bar{E}\,\bar{A}\,\bar{u}_{\bar{x}\bar{x}} + \dfrac{p_c\, x_c^2}{e_c\, a_c\, u_c}\,\bar{p} = 0$

To confine the last term within the interval [0,1] as well, its coefficient is set equal to unity, yielding the value of the normalizing parameter $u_c$:

(16)
$\dfrac{p_c\, x_c^2}{e_c\, a_c\, u_c} = 1 \;\Rightarrow\; u_c = \dfrac{p_c\, x_c^2}{e_c\, a_c}$
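For illustration, the scaling coefficients of Eqs. (9) and (16) can be evaluated numerically once the property distributions are known; the sketch below (illustrative only, using the data of the first validation case in Section 3.2.1) also indicates the reverse scaling step:

```python
import numpy as np

def rod_scaling(E, A, p, L, n=1001):
    """Scaling coefficients of Eqs. (9) and (16) for the rod problem.
    E, A, p are callables of x on [0, L]."""
    x = np.linspace(0.0, L, n)
    x_c = L
    e_c = E(x).max()
    a_c = A(x).max()
    p_c = np.abs(p(x)).max()
    u_c = p_c * x_c**2 / (e_c * a_c)   # Eq. (16)
    return x_c, e_c, a_c, p_c, u_c

# Illustrative data from case 3.2.1: E = 200 GPa, A = 1 cm^2, p = 1000 x^2 N/m, L = 1 m
x_c, e_c, a_c, p_c, u_c = rod_scaling(
    E=lambda x: 200e9 * np.ones_like(x),
    A=lambda x: 1e-4 * np.ones_like(x),
    p=lambda x: 1000.0 * x**2,
    L=1.0,
)
# u_c = 1000 * 1 / (200e9 * 1e-4) = 5e-5 m; reverse scaling: u(x) = u_c * u_bar(x / x_c)
```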

2.2.2 Elastic Euler beam

The response of an elastic Euler beam is described by the well-known equation:

(17)
$\dfrac{d^2}{dx^2}\left[E(x)\,I(x)\,\dfrac{d^2w}{dx^2}\right] - p(x) = 0, \qquad x \in [0, L]$
where L is the length, E is the Young’s modulus, I is the moment of inertia, p is the applied load and w is the pursued transverse deflection. All quantities are presumed to vary with respect to x, which precludes general analytic solutions. Eq. (17) can be expanded as
(18)
$\left(E_{xx} I + 2E_x I_x + E I_{xx}\right) w_{xx} + 2\left(E_x I + E I_x\right) w_{xxx} + E I\, w_{xxxx} - p = 0$
where subscript x indicates differentiation.

To derive the normalized formulation of Eq. (18), each term is normalized using a characteristic quantity. The normalized spatial variable, Young’s modulus and applied load are defined again as in Eqs. (4), (5) and (8), using the same scaling coefficients as in Eq. (9). In the same sense, the normalized moment of inertia and transverse deflection are defined as:

(19)
$\bar{I} \equiv \bar{I}(\bar{x}) = \dfrac{I(x)}{i_c} \;\Rightarrow\; I(x) = i_c\,\bar{I}(\bar{x})$
(20)
$\bar{w} \equiv \bar{w}(\bar{x}) = \dfrac{w(x)}{w_c} \;\Rightarrow\; w(x) = w_c\,\bar{w}(\bar{x})$
where $i_c, w_c$ are scaling coefficients ensuring that the ranges of $\bar{I}$ and $\bar{w}$ are normalized to $[0,1]$. A reasonable assumption for $i_c$ is:
(21)
$i_c = \max I(x)$
but, again, no assumption can be made for $w_c$, and it needs to be determined.

The respective spatial derivatives appearing in Eq. (18) are expressed in normalized form:

(22)
$E_x(x) = \dfrac{dE(x)}{dx} = \dfrac{d}{dx}\left[e_c\,\bar{E}(\bar{x})\right] = e_c\,\dfrac{d\bar{E}(\bar{x})}{d\bar{x}}\cdot\dfrac{d\bar{x}}{dx} = \dfrac{e_c}{x_c}\,\bar{E}_{\bar{x}}(\bar{x})$
(23)
$E_{xx}(x) = \dfrac{dE_x(x)}{dx} = \dfrac{d}{d\bar{x}}\left[\dfrac{e_c}{x_c}\,\bar{E}_{\bar{x}}(\bar{x})\right]\cdot\dfrac{d\bar{x}}{dx} = \dfrac{e_c}{x_c^2}\,\bar{E}_{\bar{x}\bar{x}}(\bar{x})$
(24)
$I_x(x) = \dfrac{dI(x)}{dx} = \dfrac{d}{dx}\left[i_c\,\bar{I}(\bar{x})\right] = i_c\,\dfrac{d\bar{I}(\bar{x})}{d\bar{x}}\cdot\dfrac{d\bar{x}}{dx} = \dfrac{i_c}{x_c}\,\bar{I}_{\bar{x}}(\bar{x})$
(25)
$I_{xx}(x) = \dfrac{dI_x(x)}{dx} = \dfrac{d}{d\bar{x}}\left[\dfrac{i_c}{x_c}\,\bar{I}_{\bar{x}}(\bar{x})\right]\cdot\dfrac{d\bar{x}}{dx} = \dfrac{i_c}{x_c^2}\,\bar{I}_{\bar{x}\bar{x}}(\bar{x})$
(26)
$w_x(x) = \dfrac{dw(x)}{dx} = \dfrac{d}{dx}\left[w_c\,\bar{w}(\bar{x})\right] = w_c\,\dfrac{d\bar{w}(\bar{x})}{d\bar{x}}\cdot\dfrac{d\bar{x}}{dx} = \dfrac{w_c}{x_c}\,\bar{w}_{\bar{x}}(\bar{x})$
(27)
$w_{xx}(x) = \dfrac{dw_x(x)}{dx} = \dfrac{d}{d\bar{x}}\left[\dfrac{w_c}{x_c}\,\bar{w}_{\bar{x}}(\bar{x})\right]\cdot\dfrac{d\bar{x}}{dx} = \dfrac{w_c}{x_c^2}\,\bar{w}_{\bar{x}\bar{x}}(\bar{x})$
(28)
$w_{xxx}(x) = \dfrac{dw_{xx}(x)}{dx} = \dfrac{d}{d\bar{x}}\left[\dfrac{w_c}{x_c^2}\,\bar{w}_{\bar{x}\bar{x}}(\bar{x})\right]\cdot\dfrac{d\bar{x}}{dx} = \dfrac{w_c}{x_c^3}\,\bar{w}_{\bar{x}\bar{x}\bar{x}}(\bar{x})$
(29)
$w_{xxxx}(x) = \dfrac{dw_{xxx}(x)}{dx} = \dfrac{d}{d\bar{x}}\left[\dfrac{w_c}{x_c^3}\,\bar{w}_{\bar{x}\bar{x}\bar{x}}(\bar{x})\right]\cdot\dfrac{d\bar{x}}{dx} = \dfrac{w_c}{x_c^4}\,\bar{w}_{\bar{x}\bar{x}\bar{x}\bar{x}}(\bar{x})$

Eq. (18) can then be recast using the normalized quantities:

(30)
$\bar{E}\bar{I}\,\bar{w}_{\bar{x}\bar{x}\bar{x}\bar{x}} + 2\left(\bar{E}_{\bar{x}}\bar{I} + \bar{E}\bar{I}_{\bar{x}}\right)\bar{w}_{\bar{x}\bar{x}\bar{x}} + \left(\bar{E}_{\bar{x}\bar{x}}\bar{I} + 2\bar{E}_{\bar{x}}\bar{I}_{\bar{x}} + \bar{E}\bar{I}_{\bar{x}\bar{x}}\right)\bar{w}_{\bar{x}\bar{x}} - \dfrac{p_c\,x_c^4}{e_c\,i_c\,w_c}\,\bar{p} = 0$

In order to constrain the last term within [0,1] as well, its coefficient is set equal to unity, yielding the value of the normalizing parameter $w_c$:

(31)
$\dfrac{p_c\,x_c^4}{e_c\,i_c\,w_c} = 1 \;\Rightarrow\; w_c = \dfrac{p_c\,x_c^4}{e_c\,i_c}$
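As a quick numerical illustration of Eq. (31) (not part of the derivation), the parameters of the uniform-beam validation case considered later in Section 3.2.4 give:

```python
# Illustrative evaluation of Eq. (31) for the uniform beam of Section 3.2.4
# (E = 200 GPa, b = 6 cm, h = 1 cm, p = 100 N/m, L = 1 m)
e_c = 200e9                    # max E(x)      [Pa]
i_c = 0.06 * 0.01**3 / 12      # max I(x) = b*h^3/12 = 5e-9 m^4
p_c = 100.0                    # max |p(x)|    [N/m]
x_c = 1.0                      # L             [m]
w_c = p_c * x_c**4 / (e_c * i_c)   # = 0.1 m; reverse scaling: w(x) = w_c * w_bar(x / x_c)
```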

3. Technical aspects and performance assessment

In this section, the architecture of the proposed SEE-PINN framework is first introduced, detailing the network structure, activation functions, training procedure, and implementation of the term-scaling approach. This ensures a comprehensive understanding of the methodology before proceeding to validation and performance evaluation.

Following this, the proposed method is validated through a series of test cases and subsequently compared to solutions found in the literature to illustrate its computational efficiency. The objective is not to undermine existing methods but to demonstrate that they can benefit from our approach and achieve enhanced accuracy and robustness.

Both simple and complex case studies are examined concerning the problems associated with the elastic rod and the elastic Euler beam to validate the proposed method. For straightforward cases, the solution obtained through PINNs is compared against analytical solutions. In more complex scenarios, where no analytical solution is available, the PINN solution is contrasted with numerical solutions.

3.1 Network architecture

Following the term scaling methodology description, a comprehensive architecture is provided here. The fundamental framework is presented for the rod and beam problems. This design is intended to be easily adaptable, enabling other researchers to extend it to various problems with minimal effort.

The proposed approach begins with a normalization step applied to the coefficients of the ODE terms before they are introduced into the input layer. These normalized coefficients, together with the neural network outputs, undergo automatic differentiation to compute the gradients required to form the ODE. The resulting terms are then incorporated into the loss function, which typically consists of one term for the ODE residual and additional terms for each boundary condition. In this work, the mean squared error is employed for the terms of the loss function. If the total loss converges below a predefined threshold, the process proceeds to de-normalization, producing the final solution. Otherwise, backpropagation updates the network’s weights and biases until convergence is reached.

The architecture for the elastic rod is illustrated in Figure 2 and can be readily adapted to the beam problem (or other related problems) by modifying the predicted quantities and the computed gradients. Specifically, for the beam, the output variable changes from the axial displacement u (for the rod) to the transverse deflection w, while the required gradients expand significantly. The rod problem involves computing $E_x, A_x, u_x$ and $u_{xx}$, whereas the beam problem requires additional terms, namely $E_{xx}, I_x, I_{xx}, w_x, w_{xx}, w_{xxx}$ and $w_{xxxx}$. Likewise, the rod problem is solved using two boundary conditions, e.g. a Dirichlet and a Neumann condition (denoted in Figure 2 by $L_D$ and $L_N$ respectively), while the beam problem requires four boundary conditions. Although the mathematical complexity increases significantly (as detailed in the respective sections), the transition from the rod to the beam remains conceptually straightforward. The same principle applies when extending the approach to other problems.


Figure 2. Network architecture for the problem of the elastic rod.

The network is implemented in PyTorch to take advantage of its computational optimizations and of the automatic differentiation (Autograd) of the derivatives appearing in the loss function.
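For orientation, a condensed PyTorch sketch of this workflow is given below for the scaled rod equation of the first validation case (Section 3.2.1), where the constant properties reduce the scaled residual to $\bar{u}_{\bar{x}\bar{x}} + \bar{p} = 0$ with $\bar{p} = \bar{x}^2$. The sketch follows the architecture and hyperparameters reported in Section 3.2, but it is a minimal illustration rather than the authors' exact implementation:

```python
import torch

torch.manual_seed(0)

# Compact network used throughout Section 3.2: one hidden layer with 10 tanh neurons
model = torch.nn.Sequential(
    torch.nn.Linear(1, 10), torch.nn.Tanh(), torch.nn.Linear(10, 1)
)
opt = torch.optim.Adam(model.parameters(), lr=0.01)

# Collocation points in the scaled domain x_bar in [0, 1]
x = torch.linspace(0.0, 1.0, 75).reshape(-1, 1).requires_grad_(True)
x0 = torch.zeros(1, 1)                         # Dirichlet end, u_bar(0) = 0
x1 = torch.ones(1, 1).requires_grad_(True)     # Neumann end,   u_bar'(1) = 0

def d(f, v):
    """First derivative of f with respect to v via Autograd."""
    return torch.autograd.grad(f, v, grad_outputs=torch.ones_like(f), create_graph=True)[0]

for epoch in range(5000):
    opt.zero_grad()
    u = model(x)
    u_x = d(u, x)
    u_xx = d(u_x, x)
    res = u_xx + x**2   # scaled residual of case 3.2.1: u_bar'' + p_bar, with p_bar = x_bar^2
    # MSE of the residual plus MSE of the Dirichlet and Neumann boundary terms
    loss = (res**2).mean() + model(x0).pow(2).mean() + d(model(x1), x1).pow(2).mean()
    loss.backward()
    opt.step()

u_c = 5e-5                                # Eq. (16): p_c*x_c^2/(e_c*a_c) for this case
u_physical = u_c * model(x).detach()      # reverse scaling back to displacements in metres
```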

3.2 Validations

3.2.1 Fixed uniform rod with distributed load

Statement of the problem. This represents a straightforward scenario. A homogeneous rod with a uniform cross-section is subjected to an axially distributed load. The left end of the rod is fixed, while the right one is free. The axial displacement along the rod is analyzed. Numerical values are:

(32)
$E = 200\ \text{GPa}, \qquad A = 1\ \text{cm}^2, \qquad p(x) = 1000\,x^2\ \text{N/m}, \qquad L = 1\ \text{m}$

Since the properties are constant along the rod, Eq. (2) reduces to

(33)
$EA\,\dfrac{d^2u}{dx^2} + p(x) = 0$
with boundary conditions
(34)
$\text{Displacement at } x=0{:}\quad u(0) = 0, \qquad \text{Force at } x=L{:}\quad EA\,u_x(L) = 0$

The analytic solution is

(35)
$u(x) = -\dfrac{x^4}{240000} + \dfrac{x}{60000}$
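As a quick symbolic sanity check (added here for the reader; not part of the original text), Eq. (35) can be verified to satisfy both Eq. (33) and the boundary conditions of Eq. (34):

```python
import sympy as sp

x = sp.symbols('x')
EA = sp.Integer(2 * 10**7)                               # E*A = 200 GPa * 1 cm^2 = 2e7 N
u = -x**4 / sp.Integer(240000) + x / sp.Integer(60000)   # Eq. (35)
residual = EA * sp.diff(u, x, 2) + 1000 * x**2           # EA u'' + p(x), with p = 1000 x^2
assert sp.simplify(residual) == 0                        # ODE satisfied
assert u.subs(x, 0) == 0                                 # u(0) = 0
assert sp.diff(u, x).subs(x, 1) == 0                     # EA u_x(L) = 0 at L = 1 m
```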

SEE-PINN solution. An appropriate neural network is designed to approximate the displacement field of the rod. The input of the neural network is the spatial coordinate x , and its output is the predicted displacement û(x) . The network consists of only one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.01 for 5000 epochs.

Comparison and error analysis. The predicted solution of the PINN is validated against the analytical solution – Eq. (35). Figure 3a demonstrates that the analytic and the PINN solution are indistinguishable from each other, as verified by the parity plot in Figure 3b. The prediction quality is further assessed using the normalized relative error, where the relative error is scaled according to the magnitude of the displacement values to prevent numerical artifacts from division by very small numbers.

(36)
$e(x) = \dfrac{\left|u(x) - \hat{u}(x)\right|}{\left|u(x) + O(u)\right|} \cdot 100\%$
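A sketch of this metric is shown below (interpreting the O(u) offset in Eq. (36) as the maximum absolute displacement; this particular choice of offset is our assumption for illustration):

```python
import numpy as np

def normalized_relative_error(u_exact, u_pred):
    """Normalized relative error of Eq. (36). The offset O(u) is taken here as the
    maximum absolute displacement (an assumed choice) to avoid division by values
    close to zero near fixed boundaries."""
    offset = np.max(np.abs(u_exact))
    return np.abs(u_exact - u_pred) / np.abs(u_exact + offset) * 100.0
```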


Figure 3. SEE-PINN vs. Analytic solution for case 3.2.1.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

The maximum relative error is ca. 0.4%, which demonstrates that the PINN approach achieves nearly exact agreement with the analytical solution.

3.2.2 Fixed rod with variable cross-section and distributed load

Statement of the problem. This case builds upon the previous problem by introducing a nonlinear variation in the cross-sectional area.

(37)
$A(x) = \left[2 + \sin\left(\dfrac{2\pi x}{L}\right)\right]\ \text{cm}^2$
while keeping all other parameters unchanged, increasing the complexity of the analysis.

SEE-PINN solution. The same neural network has been employed to approximate the displacement field of the rod; i.e. one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.01 for 5000 epochs.

Comparison and error analysis. The predicted solution is compared against a numerical reference for validation. Figure 4 illustrates that the reference and the PINN solution are again indistinguishable. With a maximum relative error of approximately 0.04%, the PINN approach demonstrates excellent agreement with the reference solution.


Figure 4. SEE-PINN vs. Analytic solution for case 3.2.2.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

3.2.3 Fixed rod with variable Young’s modulus and cross-section, and distributed load

Statement of the problem. The complexity is further increased by incorporating a nonlinear variation in the Young’s modulus of the rod.

(38)
$E(x) = 200\,\left(1 - \tanh x\right)\ \text{GPa}$
while preserving the other parameters; the cross-sectional area still varies according to Eq. (37). This variation is designed to mimic the behavior of exotic materials, similar to those found in advanced metamaterials, or the properties of damaged materials.

SEE-PINN solution. The same neural network has been employed to approximate the displacement field of the rod; i.e. one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.01 for 5000 epochs.

Comparison and error analysis. The predicted solution is validated against a numerical reference. As shown in Figure 5, the reference and PINN solutions are virtually identical. The maximum relative error is approximately 0.05%, indicating that the PINN approach achieves an almost exact match with the reference solution.


Figure 5. SEE-PINN vs. Analytic solution for case 3.2.3.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

3.2.4 Uniform Euler beam with distributed load

Statement of the problem. The second part of the validation examines three beam problems. The first case considers a uniform, homogeneous elastic beam with a length of L=1 m. The beam has a rectangular cross-section of b=6 cm, h=1 cm and is composed of a material with E=200 GPa. The left end of the beam is fixed, while the right one is simply supported. The beam is subjected to a transverse load of p(x)=100 N/m, and its transverse deflection is analyzed.

This problem can be easily solved through the analytical solution of Eq. (17) with constant properties and boundary conditions:

(39)
$\text{Deflection at } x=0{:}\ w(0)=0, \qquad \text{Rotation at } x=0{:}\ w'(0)=0, \qquad \text{Deflection at } x=L{:}\ w(L)=0, \qquad \text{Moment at } x=L{:}\ EI\,w''(L)=0$

The analytic solution is:

(40)
$w(x) = \dfrac{x^4}{240} - \dfrac{x^3}{96} + \dfrac{x^2}{160}$

SEE-PINN solution. An appropriate neural network is designed to approximate the deflection of the beam. The input of the neural network is the spatial coordinate x , and its output is the predicted transverse displacement ŵ(x) . The network consists of only one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.001 for 10000 epochs.

Comparison and error analysis. The predicted solution is validated against the analytical reference (Figure 6), showing that the analytical and PINN solutions are virtually identical (Figure 6a); this is further supported by the parity plot (Figure 6b), where all data points practically lie along the diagonal. With a maximum relative error of approximately 0.2%, the PINN approach achieves exceptional accuracy in comparison to the analytical solution.


Figure 6. SEE-PINN vs. Analytic solution for case 3.2.4.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

3.2.5 Euler beam with variable cross-section and distributed load

Statement of the problem. This case builds upon the previous problem by introducing a nonlinear transition in the inertial moment of the cross-section, represented by I :

(41)
$I(x) = I_0\left[1 + 0.5x - 0.25x^2\right], \qquad I_0 = \dfrac{b\,h^3}{12}$
while keeping all other parameters unchanged, increasing the complexity of the analysis.

SEE-PINN solution. A shallow architecture has been employed to approximate the deflection of the beam; i.e. one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.001 for 10000 epochs.

Comparison and error analysis. The predicted solution is compared against a numerical reference for validation. Figure 7 illustrates that the reference and the PINN solution are again indistinguishable. With a maximum relative error of approximately 0.35%, the PINN approach demonstrates excellent agreement with the reference solution.


Figure 7. SEE-PINN vs. Numerical solution for case 3.2.5.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

3.2.6 Euler beam with variable Young’s modulus and cross-section, and distributed load

Statement of the problem. The complexity is further enhanced by introducing a nonlinear variation in the Young’s modulus of the beam,

(42)
$E(x) = 200\,\left(1 - 0.25x - 0.5x^2\right)\ \text{GPa}$
while keeping all other parameters unchanged. The moment of inertia continues to vary according to Eq. (41).

SEE-PINN solution. The same neural network has been employed to approximate the deflection of the beam; one fully connected layer, with 10 neurons, and each neuron employs the tanh activation function. The network is trained at 75 points using the Adam optimizer with learning rate 0.001 for 10000 epochs.

Comparison and error analysis. The predicted solution is compared against a numerical reference for validation. Figure 8 illustrates that the reference and the PINN solution are again indistinguishable. With a maximum relative error of approximately 0.4%, the PINN approach demonstrates excellent agreement with the reference solution.


Figure 8. SEE-PINN vs. Numerical solution for case 3.2.6.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

3.3 Performance

The performance of the introduced methodology is assessed by comparison to existing models. The objective is to demonstrate that the suggested approach yields the same solution while utilizing significantly fewer computational resources. Given that the precise technical details of each study are not known, theoretical estimates of computational requirements and complexity have been derived from the respective network architectures.

3.3.1 Performance metric

This analysis evaluates the floating point operations (FLOPs) required for both the forward and backward passes as a key metric for assessing the performance of a neural network. While other metrics, such as the memory needed to store weights, biases, and intermediate results, could also be considered, a more detailed examination of these factors is beyond the scope of this paper.

According to Eq. (1), a neuron in a fully connected layer performs three basic operations: (a) multiplication of every input with a weight, i.e. Nin operations, (b) summation of all input-weight products, i.e. Nin−1 operations, and (c) addition of a bias, i.e. 1 operation. Thus, for a layer with Nin inputs and Nout outputs, the required number of FLOPs for a forward pass is:

(43)
$F^{L}_{FWD}(N_{in}, N_{out}) = \left[N_{in} + (N_{in} - 1) + 1\right] \cdot N_{out} = 2\,N_{in}\,N_{out}$

The backpropagation process is more complex than the forward pass and is assumed to require three times as many FLOPs; FBKD=3FFWD . For simplicity, the computations needed for activation functions are considered minimal, so any additional overhead calculations—which may vary by implementation—are not included. Therefore, the total computational load is calculated by multiplying the total number of FLOPs required for both the forward pass and backpropagation by the number of training points and the number of epochs.
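The counting convention above can be condensed into a small helper (a sketch with our own function names, following Eq. (43) and the assumption that the backward pass costs three times the forward pass):

```python
def flops_forward_layer(n_in, n_out):
    """Eq. (43): forward-pass FLOPs of one fully connected layer."""
    return 2 * n_in * n_out

def flops_total(layer_sizes, n_points, n_epochs):
    """Total training cost: (forward + backward) * training points * epochs,
    with the backward pass assumed to cost three times the forward pass."""
    f_fwd = sum(flops_forward_layer(a, b)
                for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
    return 4 * f_fwd * n_points * n_epochs
```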

3.3.2 Case studies

Case 1.

Wang et al.24 have conducted simulations on a 10 m long homogeneous rod featuring a cross-sectional area of 1 m2, composed of a material characterized by a Young’s modulus of 175 Pa. The rod was fixed at both ends and subjected to a distributed load:

(44)
$b(x) = -\left[4\pi^2(x-2.5)^2 - 2\pi\right] e^{-\pi(x-2.5)^2} - \left[8\pi^2(x-7.5)^2 - 4\pi\right] e^{-\pi(x-7.5)^2}$

The authors addressed the problem by utilizing a PINN comprising 6 hidden layers, with each layer containing 512 neurons employing the ReLU activation function. The model was trained for Ne=50,000 epochs using Np=100 data points.

The computational cost for a forward pass is the total sum of FLOPs for the input layer, the six hidden layers, and the output layer, i.e.

(45)
$F_{FWD} = F^{L}_{FWD}(1,512) + 6\,F^{L}_{FWD}(512,512) + F^{L}_{FWD}(512,1) = [2 \cdot 1 \cdot 512] + 6 \cdot [2 \cdot 512 \cdot 512] + [2 \cdot 512 \cdot 1] = 3{,}147{,}776\ \text{FLOPs}$

When including the cost of back-propagation and considering the number of training points, the total computational cost becomes:

(46)
$F_{tot} = (F_{FWD} + F_{BKD}) \cdot N_p \cdot N_e = 4\,F_{FWD} \cdot N_p \cdot N_e \approx 63\ \text{TFLOPs}$

The solution was subsequently validated against an analytical benchmark solution:

(47)
$u(x) = \dfrac{1}{EA}\left(e^{-\pi(x-2.5)^2} - e^{-6.25\pi}\right) + \dfrac{2}{EA}\left(e^{-\pi(x-7.5)^2} - e^{-6.25\pi}\right)$

The same problem was solved using the proposed approach but with a significantly smaller network, specifically two layers containing 20 neurons each, trained on 100 points for 30,000 epochs. A comparison of the predictions with the provided analytical solution in Figure 9 shows excellent agreement.


Figure 9. SEE-PINN vs. Analytical solution for case study 1.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

The computational cost for a forward pass in this configuration is given by:

(48)
$F_{FWD} = F^{L}_{FWD}(1,20) + 2\,F^{L}_{FWD}(20,20) + F^{L}_{FWD}(20,1) = [2 \cdot 1 \cdot 20] + 2 \cdot [2 \cdot 20 \cdot 20] + [2 \cdot 20 \cdot 1] = 1680\ \text{FLOPs}$
and the total cost is $F_{tot} = 20.16$ GFLOPs, which is three orders of magnitude lower than the original approach.
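Using the helper sketched in Section 3.3.1 (our illustrative code, not the authors'), the two totals of this case study can be reproduced directly:

```python
# Case study 1: layer widths chosen to reproduce Eq. (45) and Eq. (48) respectively
f_ref = flops_total([1] + [512] * 7 + [1], n_points=100, n_epochs=50_000)  # ~6.3e13, i.e. ~63 TFLOPs
f_see = flops_total([1, 20, 20, 20, 1], n_points=100, n_epochs=30_000)     # ~2.0e10, i.e. ~20 GFLOPs
print(f"SEE-PINN needs {f_ref / f_see:.0f} times fewer FLOPs")             # roughly three orders of magnitude
```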
Case 2.

Singh et al.25 simulated the bending of a 1 m long homogeneous Euler beam with a moment of inertia I = 1.0, made of a material with Young’s modulus of 1.0 Pa. The beam was fixed at one end and subjected to a distributed load:

(49)
p(x)=1−x

To solve this problem, the authors employed a PINN with 5 hidden layers, each containing 50 neurons using the tanh activation function. The model was trained for Ne=300 epochs using Np=51 data points.

The computational cost for a forward pass is the total sum of FLOPs for the input layer, the five hidden layers, and the output layer, i.e.

(50)
$F_{FWD} = F^{L}_{FWD}(1,50) + 5\,F^{L}_{FWD}(50,50) + F^{L}_{FWD}(50,1) = [2 \cdot 1 \cdot 50] + 5 \cdot [2 \cdot 50 \cdot 50] + [2 \cdot 50 \cdot 1] = 25{,}200\ \text{FLOPs}$

Taking into account the cost of back-propagation, the number of training points and the number of epochs, the total computational cost is given by:

(51)
$F_{tot} = (F_{FWD} + F_{BKD}) \cdot N_p \cdot N_e = 4\,F_{FWD} \cdot N_p \cdot N_e \approx 1.54\ \text{GFLOPs}$

The solution was validated against an analytical benchmark solution:

(52)
$w(x) = -\dfrac{x^5}{120} + \dfrac{x^4}{24}$

The same problem was solved using the proposed approach but with a significantly smaller network: a single hidden layer with 10 neurons, trained on 75 points for 1,000 epochs. As shown in Figure 10, the predictions closely match the analytical solution, demonstrating excellent agreement.


Figure 10. SEE-PINN vs. Analytical solution for case study 2.

(a) Direct comparison of predictions. (b) Parity plot of predicted vs. reference solution.

The computational cost for a forward pass using the SEE-PINN configuration is given by:

(53)
$F_{FWD} = F^{L}_{FWD}(1,10) + 1 \cdot F^{L}_{FWD}(10,10) + F^{L}_{FWD}(10,1) = [2 \cdot 1 \cdot 10] + 1 \cdot [2 \cdot 10 \cdot 10] + [2 \cdot 10 \cdot 1] = 240\ \text{FLOPs}$
and the total cost is $F_{tot} = 72$ MFLOPs, which is two orders of magnitude lower than the original approach. Performance efficiency is visually demonstrated by the comparison in Figure 11.


Figure 11. Performance comparisons between SEE-PINN and SOA models in literature (Lower is better).

The proposed approach requires two to three orders of magnitude fewer FLOPs.

4. Summary and discussion

The proposed methodology offers an efficient and streamlined approach for solving ordinary differential equations (ODEs) with Physics-Informed Neural Networks (PINNs) by directly scaling the terms of the governing equations, rather than introducing balancing weights within the loss function. Each term is normalized using characteristic physical dimensions, bringing all contributions to a similar order of magnitude close to unity. This ensures numerical consistency and eliminates the need for complex and computationally intensive loss-balancing procedures. The scaled equations are solved within a PINN framework, after which a reverse scaling step restores the solution to the physical domain.

The method has been demonstrated through nonlinear one-dimensional elasticity problems, including rod and Euler–Bernoulli beam cases. The results show that high accuracy can be achieved with extremely compact network architectures – even a single hidden layer with ten nodes – while maintaining negligible maximum percentage error across collocation points. Benchmarking against existing PINN approaches reveals that the proposed scaling strategy reduces floating-point operations (FLOPs) by at least two orders of magnitude, underscoring its potential to deliver substantial computational savings without compromising precision.

While promising, the method also presents opportunities for further development:

  • 1. Optimal Hyperparameter Selection: Automated, self-tuning strategies remain an open research goal to avoid case-by-case manual tuning.

  • 2. Extension to Higher Dimensions: Applying the methodology to 2D and 3D problems, where term coupling increases complexity, is a priority.

  • 3. Highly Nonlinear and Discontinuous Cases: Future work will target problems with sharp gradients, contact conditions, discontinuities, and dynamic effects.

  • 4. Time-Dependent Problems: These can be addressed by treating time as an additional dimension or by adopting time-aware neural architectures such as LSTMs.

In conclusion, the proposed scaling-based PINN framework (SEE-PINN) demonstrates that direct differential equation term scaling can fundamentally simplify and accelerate the training of PINNs for nonlinear problems. By completely removing the reliance on elaborate and costly loss-balancing mechanisms, it enables the use of compact, fast, and accurate models that are easier to deploy in real-world engineering settings. The combination of high accuracy, drastic computational savings, and straightforward implementation positions SEE-PINN as a practical and scalable tool, with the potential to reshape how machine learning is applied to challenging differential equation problems in computational mechanics and beyond.

Ethics and consent

Ethical approval and consent were not required.

How to cite this article: Theodosiou T and Rekatsinas C. Physics-Informed Neural Networks without Loss Balancing: A Direct Term Scaling Approach for Nonlinear 1D Problems [version 1; peer review: awaiting peer review]. F1000Research 2025, 14:1252 (https://doi.org/10.12688/f1000research.169129.1)