Applications of physics-informed neural networks for property characterization of complex materials

The characterization of in-place material properties is important for quality control and condition assessment of the built infrastructure. Although various methods have been developed to characterize structural materials in situ, many suffer limitations and cannot provide complete or desired characterization, especially for inhomogeneous and complex materials such as concrete and rock. Recent advances in machine learning and artificial neural networks (ANN) can help address these limitations. In particular, physics-informed neural networks (PINN) portend notable advantages over traditional physics-based or purely data-driven approaches. PINN is a particular form of ANN, where physics-based equations are embedded within an ANN structure in order to regularize the outputs during the training process. This paper reviews the fundamentals of PINN, notes its differences from traditional ANN, and reviews applications of PINN for selected material characterization tasks. A specific application example is presented where mechanical wave propagation data are used to characterize in-place material properties. Ultrasonic data are obtained from experiments on long rod-shaped mortar and glass samples; PINN is applied to these data to extract inhomogeneous wave velocity data, which can indicate mechanical material property variations with respect to length.


Introduction
Real-world engineering and science phenomena are complex, yet they can be simplified under some assumptions and represented using appropriate differential equations. The most appropriate and accurate differential equation for a given phenomenon or behavior is called its governing equation. This type of approach represents a "physics-based" model, as opposed to a "data-driven" model that does not consider any underlying physics or mathematics in its solution. Here we present several important representative differential equations that serve as governing equations to simulate processes and behavior in the construction materials field: Fourier's law is used to model heat transfer problems; Fick's law is used to model transport or diffusion of ions through a material; Darcy's law is widely used to model fluid flow in porous media; and Euler's laws of motion are used to describe mechanical responses of materials. Although physics-based models may not perfectly represent a specific real-world problem because of its complexity, they are effective and broadly used because of the flexibility provided by coefficient changes to account for varying environmental or material conditions. When modeling engineering problems, it is often important to understand or predict the coefficients within the governing equations. For example, those coefficients can represent material properties (e.g., density, elastic modulus, thermal expansion coefficient, specific heat, permeability, viscosity) that are important for fundamental research and engineering design, and furthermore may be used as criteria for the evaluation of material integrity. Such material characterization is more difficult with complex, inhomogeneous, or mechanically nonlinear materials such as cement, concrete, and rock, where properties vary as a function of space, time, stress state, or ambient environment; as a result, purely physics-based models often fall short in application to construction materials.
The popularity of data-driven models has increased recently because of the development of machine learning tools, access to increased computational power, and ease of data collection [1]. A key machine learning tool is the artificial neural network (ANN) [2,3], which has universal approximation capability [4-6] and has demonstrated extraordinary results in image classification [7,8], time series regression/prediction [9-11], and natural language processing [12-14] applications. For example, the performance of general object detection has improved rapidly over time as shown in Figure 1 (a). Even for complicated board games (e.g., Go and chess), the performance of ANNs is already superior to human ability [15]. At the same time, the size and complexity of ANN models have continuously increased: the total number of parameters of noted ANN models has increased dramatically over time as shown in Figure 1 (b). Despite these developments, ANNs still exhibit multiple drawbacks: (1) they represent "black box" computations with no understanding of internal processes, (2) they require a tremendous amount of training data, (3) they demand high computing power, and (4) they show comparatively poor performance for unseen data (i.e., generalization errors or overfitting problems) [16]. In order to address increasingly complex and difficult engineering and science problems, models that combine or fuse physics-based and data-driven approaches have been explored more recently. One example of this approach is a "digital twin," which is a model that represents a real-world problem with additional virtual characteristics [18,19]. There are three main components of a digital twin: (1) the physical object, (2) the virtual object, and (3) the connection between the two objects. A distinct characteristic of the digital twin model is a seamless connection between the physical object and the virtual object.
Rather than relying on data or governing equations alone, the digital twin is continuously updated with additional collected data. It can thereby provide a more accurate and versatile representation of the physical object. A more direct combination of physics-based and data-driven models is represented by physics-informed neural networks (PINN). PINN is a relatively new technique among machine learning/neural networks [20], although it is not a new concept [21,22]. PINN is a type of ANN that includes physics-based equations, usually differential equations, as prior knowledge within the training/prediction process, while conventional ANNs do not use prior knowledge or laws of physics about the prediction target. Therefore, parameters of conventional ANNs (i.e., weights and biases) are learned only through training data. PINNs differ from conventional ANNs in that they contain governing equations, in the form of differential equations, and consider compatibility between the equations and the training data. A more detailed explanation of PINN will be provided in Section 2. Although PINNs are starting to be applied to solve challenging problems across a broad range of engineering fields, applications in civil engineering, and construction materials in particular, are limited [23,24].
The aim of this technical letter is to introduce the basic structure and function of artificial neural networks (ANN) and physics-informed neural networks (PINN) (Section 2) and to highlight the potential of PINN to contribute to material characterization tasks by exploring selected PINN-related research work (Section 3) and finally by applying PINN to specific wave propagation-based experimental data (Section 4). In the experimental work, inhomogeneous material property variation is characterized by predicting the wave velocity over space. Wave velocity is often used as a material characterization parameter as it serves as a proxy for material compliance because material Young's modulus is proportional to the square of the wave velocity.

Artificial neural networks and physics-informed neural networks

Structure and function of artificial neural networks
The basic structure of an ANN is shown in Figure 2. ANNs consist of multiple layers of neurons, where the first layer is called the input layer and the last layer is called the output layer. The layers in between the input and the output layers are called hidden layers. ANNs predict one or more outputs from input data by passing the data through the hidden layers. Figure 2 (b) shows detail about the function of one neuron within the hidden layer. Each neuron is represented by weights and biases, often called neural network parameters. Once input data are passed to a neuron, the input data are multiplied by the neuron's weights and summed, and the bias is added to the result. After that, the result is fed into an activation function, which is a characteristic feature of ANNs. This process is expressed as
$$z = \mathbf{w}^{\top}\mathbf{x} + b, \qquad a = \sigma(z) \tag{1}$$
where $\mathbf{x}$ is the input vector, $\mathbf{w}$ and $b$ the weight vector and bias of a neuron, respectively, $\sigma$ the activation function, $z$ the argument of an activation function, and $a$ the output from the neuron. Usually a nonlinear activation function is used, which can take the form of a hyperbolic tangent, rectified linear unit, or sigmoid function; the choice of a specific function depends on the application. An ANN can also be expressed as
$$\hat{\mathbf{y}} = f_{NN}(\mathbf{x}; \boldsymbol{\theta}) \tag{2}$$
where $\hat{\mathbf{y}}$ represents the output values from the ANN (also known as the predicted values with respect to target values) and $f_{NN}$ the ANN expressed as a function. Throughout this paper, the hat symbol indicates predicted values. ANN parameters ($\boldsymbol{\theta}$) are usually learned using gradient descent algorithms (e.g., stochastic gradient descent (SGD), SGD with momentum, Adam) [30-32] in an iterative training process with the objective of minimizing the error between the outputs of the ANN and the training outputs. This is presented as
$$\boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_{k} - \eta \nabla_{\boldsymbol{\theta}} J(\boldsymbol{\theta}_{k}) \tag{3}$$
where $\eta$ is the learning rate, $\boldsymbol{\theta}$ the learnable parameters such as the weights and biases in neurons, $J$ the cost function, $\nabla_{\boldsymbol{\theta}}$ the gradient with respect to $\boldsymbol{\theta}$, and $k$ the iteration number.
The cost function (also called the loss function) is the function that quantifies the error between model outputs and expected outputs. The choice of the cost function depends on the application and structure of the ANN. Typically, mean square error (MSE) and cross-entropy functions are preferred for regression and classification applications, respectively. An example cost function $J$ when MSE is used is given by
$$J = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{4}$$
where $N$ is the total number of training data, $y_i$ the training output data corresponding to the training input data $\mathbf{x}_i$ or latent (or true) function, and $\hat{y}_i$ the output from an ANN model corresponding to the training input data $\mathbf{x}_i$.
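As a minimal sketch of the neuron computation, the MSE cost, and the gradient descent update described above, the following NumPy example fits a two-parameter linear model to synthetic data. The function names, the learning rate, the iteration count, and the synthetic target are illustrative assumptions and are not taken from the paper; practical ANN training would rely on a deep learning framework and an optimizer such as Adam.

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of the inputs plus bias, then tanh activation."""
    z = np.dot(w, x) + b
    return np.tanh(z)

def mse(y_true, y_pred):
    """Mean square error between target and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

def fit_linear(x, y, lr=0.1, n_iter=500):
    """Plain gradient descent on the MSE cost for the model y_hat = w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        err = (w * x + b) - y
        grad_w = 2.0 * np.mean(err * x)  # dJ/dw
        grad_b = 2.0 * np.mean(err)      # dJ/db
        w -= lr * grad_w                 # theta_{k+1} = theta_k - eta * grad
        b -= lr * grad_b
    return w, b

# Synthetic, noise-free training data (assumed for illustration only).
x_train = np.linspace(-1.0, 1.0, 50)
y_train = 2.0 * x_train + 0.5
w_fit, b_fit = fit_linear(x_train, y_train)
final_cost = mse(y_train, w_fit * x_train + b_fit)
```

With these assumed settings the parameters converge to the slope and intercept used to generate the data, and the cost approaches zero.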

Physics-informed neural networks
PINN is a form of ANN where physics-based equations (most usually in the form of differential equations) are embedded directly within the ANN structure; these equations act as regularization agents during the training process. In other words, the parameters in PINN are tuned to comply with the embedded physics-based governing equations. Because PINNs are closely related to differential equations, we define a differential equation in general terms as
$$\mathcal{D}[u(\mathbf{z})] = 0 \tag{5}$$
where $\mathcal{D}[\bullet]$ is a differential operator, $u$ the dependent variable, and $\mathbf{z}$ the independent variable. If $\mathbf{z}$ contains spatial and temporal data then it can be rewritten as
$$\mathcal{D}[u(\mathbf{x}, t)] = 0, \quad \mathbf{x} \in \Omega \subset \mathbb{R}^{d}, \; t \in [0, T]$$
where $\mathbf{x}$ is spatial coordinate data, $t$ temporal data, and $\Omega$ denotes the spatial domain. Spatial and temporal data are separated intentionally here to distinguish spatial and spatiotemporal problems. Partial differential equations (PDEs) usually contain initial and boundary conditions to ensure the existence and uniqueness of the solution, for example
$$u(\mathbf{x}, 0) = h(\mathbf{x}), \quad \mathbf{x} \in \Omega \tag{6-1}$$
$$u(\mathbf{x}, t) = g(\mathbf{x}, t), \quad \mathbf{x} \in \partial\Omega, \; t \in [0, T] \tag{6-2}$$
where $t$ is time, $T$ the upper bound of time-domain data, $d$ the dimension of the spatial data, and $h$ and $g$ are arbitrary functions. Eqs. (6-1) and (6-2) represent the initial condition (IC) and boundary condition (BC), respectively.
The main characteristic of PINN is that physics-based equations are embedded as prior knowledge; a conventional ANN does not use any prior knowledge about the data to be trained or the target, so it is a fully data-driven model. In the case of conventional ANNs, all parameters (i.e., weights and biases of each neuron) are randomly initialized and learned (or tuned) to minimize the specific cost function that is defined. Because conventional ANNs are fully data-driven models, they usually require a great amount of training data to adapt to a generalized problem. On the other hand, PINNs require less training data when they are used for inverse problem solving [33,34], and in some cases they do not need any training data when deployed for forward problem solving.
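Furthermore, PINNs are more robust and generic than traditional ANNs [35]. To achieve a specific objective using an ANN, two different approaches are typically used. The first is to build or design a specific neural network architecture, for example a convolutional neural network [36,37], recurrent neural network [38], or long short-term memory network [11,39]. The second approach is to choose an appropriate loss or cost function. Most PINN applications take the second approach. The physics-based equations are incorporated in the cost function $J$ as
$$J = \sum_{i=1}^{N_{\mathcal{L}}} \mathcal{L}_i \tag{7}$$
where $\mathcal{L}_i$ represents the loss terms and $N_{\mathcal{L}}$ is the total number of loss terms in the cost function. The number of loss terms varies depending on the application. This cost function is different from that shown in Eq. (4), which is used in a conventional ANN. Typical loss terms used in PINN are given by
$$\mathcal{L}_{\mathcal{R}} = \frac{1}{N_{\mathcal{R}}}\sum_{i=1}^{N_{\mathcal{R}}} \left|\mathcal{R}(\mathbf{z}_i)\right|^2, \qquad \mathcal{L}_{\mathrm{data}} = \frac{1}{N_{\mathrm{data}}}\sum_{i=1}^{N_{\mathrm{data}}} \left|\hat{u}(\mathbf{z}_i) - u^{*}(\mathbf{z}_i)\right|^2,$$
$$\mathcal{L}_{\mathcal{I}} = \frac{1}{N_{\mathcal{I}}}\sum_{i=1}^{N_{\mathcal{I}}} \left|\hat{u}(\mathbf{x}_i, 0) - h(\mathbf{x}_i)\right|^2, \qquad \mathcal{L}_{\mathcal{B}} = \frac{1}{N_{\mathcal{B}}}\sum_{i=1}^{N_{\mathcal{B}}} \left|\hat{u}(\mathbf{x}_i, t_i) - g(\mathbf{x}_i, t_i)\right|^2 \tag{8}$$
where $\mathcal{L}_{\mathcal{R}}$ is the loss from the PDE residual, $\mathcal{L}_{\mathrm{data}}$ the loss between observed data and output values from PINN, $\mathcal{L}_{\mathcal{I}}$ the loss from the initial condition, $\mathcal{L}_{\mathcal{B}}$ the loss from the boundary condition, and $N_{\mathcal{R}}$, $N_{\mathrm{data}}$, $N_{\mathcal{I}}$, $N_{\mathcal{B}}$ are the number of training data used in each respective loss term. Note that not all PINN applications use the full set of loss terms listed here. The form of the residual ($\mathcal{R}$) depends on the embedded equations in the PINN model.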
For example, if the embedded equation is the 2-D Laplace equation, then the residual is given by
$$\mathcal{R}(x, y) = \frac{\partial^2 \hat{u}}{\partial x^2} + \frac{\partial^2 \hat{u}}{\partial y^2} \tag{9}$$
The input data ($\mathbf{z} = (x, y)$) used in Eq. (9), also called the collocation points, must satisfy the physics-based equation. To calculate the residual, the dependent variable, which is the output of PINN ($\hat{u}$), is differentiated with respect to the independent variables, which are the input of PINN ($\mathbf{z}$). This differentiation is carried out using automatic differentiation [40,41]. $u^{*}$ represents the given target values. Depending on the application, the target values ($u^{*}$) can be either $u$, which is a true function without error, or $u + \varepsilon$, which is the sum of the true function and an unknown measurement error.
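The way a residual loss and a data loss combine can be sketched for the 2-D Laplace example. The sketch below is illustrative only: it evaluates the residual with central finite differences as a stand-in for the automatic differentiation that a PINN framework would use, it replaces the neural network with two closed-form candidate functions, and all names, sample counts, and the step size are assumptions.

```python
import numpy as np

def laplace_residual(u, x, y, h=1e-3):
    """Central-difference estimate of u_xx + u_yy at the points (x, y)."""
    u_xx = (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / h**2
    u_yy = (u(x, y + h) - 2.0 * u(x, y) + u(x, y - h)) / h**2
    return u_xx + u_yy

def pinn_loss_terms(u, x_col, y_col, x_obs, y_obs, u_obs):
    """Return (residual loss, data loss, total cost) for a candidate solution u."""
    loss_residual = float(np.mean(laplace_residual(u, x_col, y_col) ** 2))
    loss_data = float(np.mean((u(x_obs, y_obs) - u_obs) ** 2))
    return loss_residual, loss_data, loss_residual + loss_data

def harmonic(x, y):
    """Satisfies the Laplace equation exactly."""
    return x**2 - y**2

def non_harmonic(x, y):
    """Violates the Laplace equation: u_xx + u_yy = 4 everywhere."""
    return x**2 + y**2

rng = np.random.default_rng(0)
x_col, y_col = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)  # collocation points
x_obs, y_obs = rng.uniform(-1, 1, 20), rng.uniform(-1, 1, 20)    # observation points
u_obs = harmonic(x_obs, y_obs)                                    # synthetic observations

good = pinn_loss_terms(harmonic, x_col, y_col, x_obs, y_obs, u_obs)
bad = pinn_loss_terms(non_harmonic, x_col, y_col, x_obs, y_obs, u_obs)
```

The candidate that satisfies both the equation and the observations yields a cost near zero, whereas the second candidate is penalized by both terms; during PINN training, the network parameters are adjusted to drive this combined cost down.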
PINNs can be used to solve a PDE, also known as the forward problem, or to predict coefficients within embedded governing equations, also known as the inverse problem; more specifically, the inverse problem finds or extracts features or model parameters from observed or measured data. When PINNs are used to solve the forward problem, we consider it to be "semi-supervised" learning because (1) the training process requires both labeled and unlabeled data and (2) a significant portion of the data is unlabeled data where labeled data are readily obtained. When PINNs are used to solve inverse problems, labeled data are required, so it is considered "supervised" learning. PINNs offer advantages over conventional numerical techniques for the forward problem: they are mesh-free, so there is no discretization error, and the challenge of dimensionality can be avoided. Considering that solving inverse problems is not trivial using conventional methods, PINNs may provide effective solution approaches.

Material characterization related applications
Although the application of PINN in the construction materials field is not yet common, several application studies have been conducted. In this section, selected papers that are directly or indirectly related to civil engineering or construction materials fields are reviewed. First, fluid flow (or diffusion) related applications are considered. Tartakovsky et al. [42] studied flow in porous media using 2-D steady-state linear and nonlinear diffusion equations
$$\nabla \cdot \left[K(\mathbf{x}) \nabla h(\mathbf{x})\right] = 0 \tag{10-1}$$
$$\nabla \cdot \left[K(h) \nabla h(\mathbf{x})\right] = 0 \tag{10-2}$$
where $K$ and $h$ are hydraulic conductivity and hydraulic head or capillary pressure, respectively. The main objective of Tartakovsky's study was to predict full $K$ and $h$ fields from individual sparse measurements of $K$ and $h$. Essentially, Eq. (10-1) describes saturated flow in an inhomogeneous medium while Eq. (10-2) describes unsaturated flow in a homogeneous medium.
To solve and predict the linear diffusion equation (Eq. 10-1), the authors used two separate ANNs to predict hydraulic conductivity ($K$) and hydraulic head ($h$) individually. Each ANN had 3 hidden layers and 50 neurons per layer. The number of hidden layers and neurons is usually determined empirically, which was the case with Tartakovsky's work. The training data (or reference data) were generated using the finite volume method. In addition to 1024 uniformly distributed collocation points, 250 points of hydraulic conductivity and 100 points of capillary pressure were randomly selected and used for the training process. The relative $L_2$ errors were 1.7 and 0.5 % for $K$ and $h$, respectively. The relative $L_2$ error is defined as $\|\hat{q} - q\|_2 / \|q\|_2$ where $q$ is the quantity of interest, $\hat{q}$ its prediction, and $\|\cdot\|_2$ the two-norm. When they solved the nonlinear diffusion equation (Eq. 10-2), the reference data were generated using a numerical solver (Subsurface Transport Over Multiple Phases (STOMP) [43]). In this case, hydraulic conductivity values were not provided and the relative $L_2$ errors were less than 1 % for both $K$ and $h$. Additionally, the authors found that the initial parameters (weights and biases) of the ANN model have only minor effects on prediction results. Fuks and Tchelepi [44] solved a forward problem concerning two-phase transport in porous media using the 1-D Buckley-Leverett model [45]. Water saturation was estimated well when there was no abrupt discontinuity (i.e., shock behavior) in the data; however, they had difficulty in finding a solution in the presence of a shock. Adding a diffusion term to the model equation alleviated the problem. Yu et al. [46] solved the diffusion-reaction equation using gradient-enhanced PINN (gPINN). The diffusion-reaction equation describes substance changes over time and space considering reaction and diffusion. gPINN makes use of the fact that the derivatives of the PDE residual should also be zero; this requirement is added to the residual loss ($\mathcal{L}_{\mathcal{R}}$) in the cost function (Eq. (7)).
The results were compared with the analytic solution and the relative $L_2$ error for the solution was less than 1 %. In addition, an inverse problem was solved using a steady-state diffusion-reaction equation with a space-dependent reaction rate term. The reaction rate function was well predicted and gPINN showed better performance than a conventional PINN.
Now we consider studies related to heat transfer. He et al. [47] studied 1-D heat conduction using PINN. The governing equation that they used, Eq. (11), is a 1-D heat conduction equation with two constant coefficients and a heat source function; its dependent variable is the temperature. At first, two forward problems were solved, with Dirichlet and Neumann boundary conditions, respectively. When both coefficients of the governing equation were set to 1, the solution from PINN matched the analytic solution well. However, when wood and steel material properties were used, the conventional PINN model produced large errors. After normalizing the time-domain data and appropriately scaling the physics-based equation, reasonably good results were obtained when compared with numerical simulation results (ABAQUS [48]). The authors claim that data normalization alleviates the vanishing gradient problem. Next, three inverse problems were carried out. The first inverse problem predicted one constant coefficient in Eq. (11) given the other coefficient and the heat source function. The second predicted the two constant coefficients in Eq. (11) given the heat source function. The third predicted the source term in Eq. (11) given both coefficients. The authors showed that two techniques help to increase PINN's performance: the skip connection used in ResNet [49] and the adaptive activation function [50]. Cai et al. [51] solved convective heat transfer problems using PINN. In their model, the convection-diffusion equation and the incompressible Navier-Stokes equation were used. In their study the governing equations were given but only partial boundary conditions were provided, which represents an ill-posed problem.
In their solution, sparse measurements of temperature and velocity were provided to a PINN to predict temperature, velocity, and pressure fields. Rad et al. [52] used PINN to solve an alloy solidification modeling problem. In their paper, the model consists of the energy conservation equation, solute conservation equation, and thermodynamic relations. The outputs of the PINN model were temperature, solid fraction, and solute concentration. The temperature predictions were compared with the results from an open-source computational fluid dynamics software platform (OpenFOAM). The solid fraction and solute concentration predictions were compared with the exact analytical solutions, where the PINN predictions matched the analytical solutions well. The authors also analyzed the optimal range of initial learning rate for the Adam optimizer for this problem.
Finally, we consider solid mechanics related materials studies. Haghighat et al. [53] predicted Lamé parameters (typically represented as $\lambda$ and $\mu$) for a 2-D homogeneous elastic plane-strain problem, which represents an inverse problem. The training data were obtained from an analytic solution where displacement, stress, and force were used as training data sets. They also applied PINN to a nonlinear problem based on the von Mises elastoplastic constitutive model, where yield stress and Lamé parameters were predicted. Bharadwaja et al. [54] solved a 2-D inhomogeneous and linear elasticity problem. They considered an inhomogeneous material where either internal voids or high elastic modulus inclusions were included within a base solid material. The PINN framework 'Modulus,' developed by Nvidia, was used to create PINN models. As in the work of Tartakovsky et al. [42], two separate ANN models were used to predict displacement and stress separately. The stress predictions from the PINN model matched well with results obtained by a commercial finite element software platform.
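Several of the studies reviewed in this section report accuracy as a relative L2 error, that is, the two-norm of the prediction error divided by the two-norm of the reference. A minimal sketch of that metric is given below; the function name and the sample values are illustrative assumptions and are not data from the cited studies.

```python
import numpy as np

def relative_l2_error(reference, prediction):
    """Two-norm of the prediction error divided by the two-norm of the reference."""
    reference = np.asarray(reference, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return float(np.linalg.norm(prediction - reference) / np.linalg.norm(reference))

# Illustrative values only: the error norm is 0.5 and the reference norm is 5.
example_error = relative_l2_error([3.0, 4.0], [3.0, 4.5])
```

Multiplying the returned value by 100 gives the percentage form used when the errors are quoted.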

Specific application related to material characterization using mechanical wave propagation
In this section, we demonstrate how PINN can be used to characterize material properties using mechanical wave propagation data.
The well-known linear wave equation, which governs mechanical wave propagation in solids, is
$$\frac{\partial^2 u}{\partial t^2} = c(x)^2 \frac{\partial^2 u}{\partial x^2} \tag{12}$$
where $u$ is displacement, $x$ space, $t$ time, and $c$ the propagating wave velocity. Here the wave velocity is a function of space and is proportional to the square root of Young's modulus of the material. Our goal is to characterize wave velocity variation along the length of inhomogeneous samples. To enable this, wave propagation data were collected over the length of two different samples. Because the square of the wave velocity is proportional to the elastic modulus of the material, prediction of the wave velocity over space can be used to evaluate spatial variations in material integrity. In this work, PINN was used as an inverse problem solver. Considering that most PINN papers published to date consider data obtained by numerical analysis or analytic solution, the results shown here serve to show the potential of PINN for handling experimental data.
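The idea of recovering a wave velocity from displacement data through the wave equation can be illustrated without a neural network. The sketch below is not the PINN approach used in this work: it assumes a constant velocity, noise-free synthetic data on a regular grid, and finite differences in place of automatic differentiation, and it finds the velocity that minimizes the squared wave-equation residual in closed form. The grid spacing, the 5000 m/s reference velocity, and the function name are illustrative assumptions; the 90 kHz frequency mirrors the excitation frequency reported for the glass sample.

```python
import numpy as np

def estimate_wave_speed(u, dx, dt):
    """Least-squares constant c from u_tt = c^2 u_xx on a grid u[i_t, i_x]."""
    u_tt = (u[2:, 1:-1] - 2.0 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dt**2
    u_xx = (u[1:-1, 2:] - 2.0 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2
    # Minimizing sum((u_tt - c2 * u_xx)^2) over c2 gives this closed form.
    c_squared = np.sum(u_tt * u_xx) / np.sum(u_xx * u_xx)
    return float(np.sqrt(c_squared))

c_true = 5000.0                      # m/s, assumed reference velocity
frequency = 90e3                     # Hz
k = 2.0 * np.pi * frequency / c_true # wavenumber of the travelling wave

x = np.arange(0.0, 0.5, 1e-3)        # positions, 1 mm spacing (assumed)
t = np.arange(0.0, 1e-4, 1e-7)       # times, 0.1 microsecond spacing (assumed)
X, T = np.meshgrid(x, t)             # arrays of shape (len(t), len(x))
u = np.sin(k * (X - c_true * T))     # synthetic travelling-wave displacement

c_est = estimate_wave_speed(u, x[1] - x[0], t[1] - t[0])
```

A PINN generalizes this idea by letting a subnetwork represent a spatially varying velocity and by fitting the displacement field and the velocity simultaneously, which also makes the approach usable with noisy or irregularly sampled measurements.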

Experimental setup and sample description
Two cylindrical rod samples were used in the experiments: one is composed of borosilicate glass and the other of portland cement mortar. Both samples are 25.4 mm in diameter and 147 cm in length. The glass sample is pristine and ostensibly homogeneous without any obvious or known damage. The mortar sample consists of two distinct sections along its length: a "strong" matrix design with a water-to-cement (w/c) ratio of 0.5 and a "weak" design with a w/c of 0.6, with a distinct boundary between the two mixtures at about one-third of the sample length. For both mortar mixtures, the cement-to-sand ratio is 1/3 by mass. Figure 3 shows the experimental setup from which wave propagation data were collected along each sample's length. To generate mechanical (ultrasonic) waves in the samples, piezoelectric (PZT) discs are attached to the flat end of each sample. The waves generated by the PZT discs propagate along the length of the samples and are detected using an air-coupled transducer (ACT) positioned above the outer surface; ACTs offer the advantage of not requiring physical contact with the samples and the ability to collect large amounts of consistent data quickly. A 1-D linear lead screw actuator was used for the motion stage to control the position of the ACT along the sample length. For both samples, ultrasonic signal data were measured every 5 mm along the length of the samples using a sampling rate of 12.5 MS/s. The distance (liftoff) between the ACT and the samples was set and maintained at 65 mm.

Detail of PINN model
Figure 4 shows the PINN architecture used in this work. The first layer is the input layer, which accepts two signal features: space and time. It is common to normalize data to improve convergence speed when ANNs are used. One drawback, however, of normalizing data before feeding them into a model is that the physical meaning of the input data is eliminated. Therefore, rather than using normalized input data, a normalization layer is added into the architecture of the ANN so that the model can retrieve original data values through the process of back-propagation. The normalization layer calculates the mean value and standard deviation of the input data before the training process begins. The calculated values are stored in the normalization layer and are used to convert the input data set to have a mean of 0 and a standard deviation of 1, which is z-score normalization. After that, the two separate 1-D sets of scalar input data are concatenated into one vector in the subsequent concatenate layer. Then the normalized data are passed into the fully connected layer network. As shown in Figure 4, the network comprises a 'Main network' and other subnetworks. The main network is used to predict output values (displacement) and the other subnetworks are used to predict coefficients of the embedded differential equation; one subnetwork is used to predict the wave velocity ($\hat{c}$), which is the principal output value of interest. This latter subnetwork considers spatial coordinate information so that the prediction of the wave velocity is a function of the spatial coordinate. Note that the true output values of the main network (displacement) are only provided during the training process, and those of the subnetwork (wave velocity) are not provided. A total of 4 hidden layers, each with 40 neurons, comprise the main network and 3 hidden layers, each with 20 neurons, comprise the subnetwork. The output value ($\hat{u}$) from the main network is double-differentiated with respect to the input data ($x$, $t$). Finally, the differentiated values ($\partial^2 \hat{u}/\partial t^2$, $\partial^2 \hat{u}/\partial x^2$) are multiplied with the corresponding coefficients to calculate the residual of the implemented physics-based equation. That residual is defined as
$$\mathcal{R} = \frac{\partial^2 \hat{u}}{\partial t^2} - \hat{c}(x)^2 \frac{\partial^2 \hat{u}}{\partial x^2} \tag{13}$$
and the cost function used in this work is built from the loss on this residual ($\mathcal{L}_{\mathcal{R}}$) and the loss between the measured and predicted displacement, following Eqs. (7) and (8). The Adam optimizer is used in the training process with an initial learning rate of $5 \times 10^{-4}$. The objective of using this PINN model is to predict the wave velocity using the embedded equation and the training output, which is the displacement. The predicted wave velocity can be related to Young's modulus ($E$) and density ($\rho$) using the equation $c_b = \sqrt{E/\rho}$ to monitor material compliance changes along the long rod-shaped samples, where $c_b$ is the bar velocity and is considered to represent $c$.

Experimental results and PINN predictions
Figures 5 (a) and (b) show mechanical wave propagation measurement data for the glass and mortar samples, respectively. Signal data were measured every 5 mm along the length of the samples. All of the multiple wave measurements are presented at once using the time-space domain, where the amplitude of the received signal is normalized to have values from -1 to 1 as represented by the color scale. The signal amplitude from the ACT sensor is directly proportional to the measured acoustic pressure. When absolute values of displacement are not needed, the small-strain and harmonic wave propagation assumption enables us to consider that the signal amplitude is proportional to relative surface displacement. A 14-cycle tone burst signal with a 90 kHz center frequency was used as the excitation source for the glass sample, and a 75 kHz center frequency for the mortar sample. For both cases, the excitation voltage to the PZT sender was 40 V. In both material samples, coherent ballistic propagating wave fronts can be observed without noticeable dispersion; thus, it is reasonable to apply the simple 1-D wave equation here. In order to reduce the number of training data, a time window (250 to 350 µs for the glass data and 300 to 450 µs for the mortar data) was applied to the original data, and only that part of the signal within the window was used. 20 % of the data were randomly selected to be set aside and thus were not used for PINN training; these data were later used to test the performance of PINN at spatial positions not seen during training. A total of 61,299 and 91,924 data points were used for the glass and mortar samples, respectively. Although the amount of data may seem large, each training data point consists of only three scalar quantities, so the absolute data set sizes are relatively small at 500 and 740 kB, respectively.
The wave velocity prediction results over length for the glass and mortar samples are shown in Figures 7 (a) and (b), respectively. The green solid line represents the experimentally obtained wave velocity profile calculated by connecting multiple zero-crossing points of the wave fronts to determine time of flight for specific, known measurement positions over the length of the sample; we consider this to be the true wave velocity profile although this is not fully correct because of inherent measurement error and the simplifying linear wave propagation approximation.
The black solid line connecting black and red points indicates wave velocity prediction results from the PINN model. As mentioned earlier, 20 % of data were retained as test data (red points) and the remaining 80 % of the data (black points) were used for training the PINN. Although no significant wave velocity prediction error was observed at the red points for both tests, improved prediction performance is expected if more training points are used.
In the case of the glass sample, the wave velocity shows variation along its length where the mean value for the "true" wave velocity profile obtained from measurement data is 5163.3 m/s, while that from PINN is 5146.4 m/s, resulting in an overall relative error of -0.33 % for the PINN model. For the mortar sample, a notable change (reduction) in the wave velocity is observed to start at around 180 mm from the first measurement point in Figure 7 (b), which matches the expected location of the boundary of the two component sections ("strong" and "weak") within the mortar sample. Based on the measurement results, the mean value of the experimentally obtained wave velocity in the strong section is 3967.
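The relative error quoted above for the glass sample can be checked with a short calculation, and the same numbers indicate what the velocity difference implies for Young's modulus. The sketch below assumes equal density for both estimates and uses the relation that the modulus equals density times velocity squared; the function names are illustrative and the modulus figure is a derived illustration rather than a result reported in the paper.

```python
def relative_error_percent(reference, estimate):
    """Signed relative error of an estimate with respect to a reference, in percent."""
    return 100.0 * (estimate - reference) / reference

def modulus_ratio(c_estimate, c_reference):
    """Ratio of Young's moduli implied by two velocities at equal density (E = rho * c^2)."""
    return (c_estimate / c_reference) ** 2

c_measured = 5163.3  # m/s, mean of the experimentally obtained profile (glass)
c_pinn = 5146.4      # m/s, mean of the PINN prediction (glass)

velocity_error = relative_error_percent(c_measured, c_pinn)         # about -0.33 %
modulus_change = 100.0 * (modulus_ratio(c_pinn, c_measured) - 1.0)  # about -0.65 %
```

Because the modulus scales with the square of the velocity, a small relative error in velocity roughly doubles when expressed as a relative error in modulus.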

Summary, Conclusions, and Perspectives
In this paper the basic structure and function of artificial neural networks (ANN) and physics-informed neural networks (PINN) are introduced, and the potential for PINN to contribute to material characterization tasks is considered. We suggest that PINNs show potential for effective application to a broad range of problems related to construction materials issues, in particular flow in porous media and characterization of inhomogeneous media. A specific application of PINN analysis for mechanical wave propagation in homogeneous and inhomogeneous media is presented, where the material wave velocity profile as a function of space was predicted. We conclude that this particular application of PINN is useful in that it can characterize inhomogeneous material properties as a function of space from a single measurement set, and thus may be useful to evaluate material integrity variations given the connection between wave velocity and material properties. Although only a simple 1-D case was considered in this work, the concept can be readily expanded to 2-D or 3-D cases, which suggests that this approach can be applied to more realistic structural elements, such as beams or plates, to characterize in-place material properties. Furthermore, other meaningful wave propagation characteristics, such as the wave attenuation coefficient or wave nonlinearity parameters, can be extracted through appropriate modifications to the PINN structure. These additional characteristics could be used to characterize sophisticated material properties or to diagnose early damage.
It is clear that the development and deployment of PINN to engineering and science problems is a rapidly emerging field, and one that shows great potential. However, PINN is not a universal tool and one can, and should, question whether PINN can completely replace existing analytical, numerical, or purely physics-based methods. PINNs exhibit many aspects that need improvement, for example choice of hyperparameters, spectral bias, balancing between loss terms, computational cost, etc. At the same time it should be noted that PINN is in its infancy; as future technological and computing developments enable rapid collection of large data sets, increase in computing power, and rapid growth and acceptance of the ANN algorithm, we expect PINN will emerge as a helpful and commonly applied tool to solve difficult problems related to construction materials research.