I. INTRODUCTION

Historical buildings are important carriers of human history and culture, and protecting their integrity and authenticity is of great significance for the inheritance of cultural heritage [1]. Over time, many historical buildings have been or are being damaged by natural and human factors such as earthquakes, floods, fires, wars, and urbanization [2]. Traditional restoration techniques rely mainly on expert experience and manual operations; they are time-consuming and costly, and they cannot guarantee consistent restoration results [3]. In recent years, with the rapid development of artificial intelligence, especially computer vision and graphics, digital restoration has become a research hotspot in the cultural heritage field [4]. Digital restoration mostly uses three-dimensional (3D) modeling and reconstruction technologies, together with artificial intelligence algorithms, to create high-precision 3D models of historical buildings and assist experts in restoration work [5]. However, accurately reconstructing and repairing damaged historical buildings with computer algorithms remains a challenging problem. In response, this paper proposes a digital restoration technology for historical buildings based on generative adversarial networks (GANs) and 3D point cloud (3D PC) reconstruction. The research aims to develop a high-precision, high-efficiency artificial intelligence restoration technology that provides strong technical support for cultural heritage protection by accurately reconstructing 3D models of historical buildings and thereby realizing their digital restoration. The innovation lies in combining 3D PC technology with GANs to advance the automated and intelligent restoration of historic buildings, which offers new possibilities for the conservation and restoration of cultural heritage.

The paper contains five parts. The first part is the introduction, which describes the impact and contribution of 3D PC reconstruction and GANs in the digital restoration of historical buildings against the background of rapidly developing information technology. The second part is a literature review, which surveys the research status and applications of digital restoration of historical buildings, 3D PC reconstruction, and GANs in various fields. The third part studies the digital restoration technology of historical buildings based on 3D PC reconstruction and GANs: its first section presents the 3D PC reconstruction and GAN framework, and its second section establishes the digital restoration framework built on them. The fourth part tests the proposed algorithm. The fifth part gives a summary and outlook.

II. RELATED WORKS

Historical architecture is an important cultural heritage, and its protection and restoration have received widespread attention worldwide. Within this work, 3D PC reconstruction has become a popular topic in heritage protection, and many scholars have studied it. Yu Q et al. proposed the local intelligence AtlasNet, which restricted each neural network to reconstructing a specific part of a 3D object. The results indicated that this design made it possible to apply multiple local constraints to the final reconstruction loss and to better restore 3D objects with fine local structures. It outperformed other methods and generated structured point clouds with higher visual quality [6]. Li Y et al. proposed ADR-MVSNet, a cascaded network that included an adaptive depth reduction module for enhancing reconstruction accuracy and a multi-cost volume aggregation module for better estimating the depth of occluded pixels. The results indicated that ADR-MVSNet achieved highly accurate and complete reconstruction on the DTU and Tanks and Temples datasets, outperforming state-of-the-art benchmarks [7]. Zhang Y et al. proposed a descriptor called kernel density, which encoded the entire 3D space around feature points through kernel density estimation. The results showed that the descriptor was descriptive, robust, and compact, and that it achieved excellent results on real-world point clouds of Terra Cotta Warriors fragments. This demonstrated its advantages and applicability in target registration, recognition, and 3D target reconstruction [8]. Su Y et al. proposed a dual local attention (DLA) module for learning DLA features, added an enhanced position encoding block, and embedded the DLA module into various network architectures for point cloud segmentation. The results indicated that DLA-Net outperformed existing state-of-the-art semantic segmentation methods on building facade datasets [9]. Bruno V D et al. used a novel 3D quantitative coronary angiography software to determine the correlation between coronary volume occlusion and myocardial scar expansion. The results indicated a significant correlation between the volume of the obstructed left anterior descending (LAD) artery and the overall scar size during the acute phase. A blocked LAD artery volume of 32.8% could be used to determine that the scar volume was greater than 20% of the entire lumen volume [10].

Integrating advanced concepts and methods from fields such as computer vision, deep learning, and digital architecture provides new possibilities for the precise restoration and protection of historical buildings. In this context, many scholars have conducted extensive research on GAN technology. Saeed A built on the WGANsing model, which uses linguistic and vocoder features within a Wasserstein GAN framework. The results showed that the WGANsing model was further improved after model overshoot, and that it was also able to synthesize songs in languages other than English [11]. Zhang H et al. proposed an attention GAN based on contrastive learning for defect detection in colored patterned fabrics. This network consisted of two parts. The experiment showed that the intersection-over-union and F1-measure of the network reached 38.25% and 51.67%, respectively, on the YDFID-2 public dataset, which demonstrated high detection accuracy [12]. Cheng M et al. presented a real-time prediction deep convolutional GAN for flood prediction. This network consisted of two stages: dynamic flow learning and real-time prediction. The results indicated that the network was highly consistent with the flows predicted by high-fidelity models, providing an effective tool for real-time flow prediction [13]. Wang Y et al. proposed a blind image denoising method based on an asymmetric GAN. This method added an image downsampling layer to both the generative model and the discriminative model, and it used a multi-scale feature downsampling layer to extract image features and reduce the impact of noise on the training images. The results verified that the method offered high performance and flexibility [14]. Wang P et al. proposed a new data generation method that combined a conditional variational autoencoder with a GAN. It was used to address the data imbalance caused by the irregular distribution of power plant operation data. The results indicated that training the prediction model on the dataset generated by this method could improve the accuracy of NOx emission prediction [15].

In summary, introducing 3D PC technology and GANs into the digital restoration of historical buildings transforms traditional manual restoration into a digital and intelligent restoration mode. However, the two methods still have shortcomings: their accuracy on complex structures and fine details needs improvement, and they require a large amount of training data. Therefore, this study combines GANs with 3D PC reconstruction technology to optimize and extend digital restoration technology for historical buildings by improving the expressiveness and stability of the model. The approach has broad application prospects and profound social impact.

III. DIGITAL RESTORATION METHODS BASED ON 3D PC RECONSTRUCTION AND GANs

This section discusses the restoration and reconstruction of historical buildings through artificial intelligence-based 3D PC reconstruction and GAN-based digital restoration. First, the application of artificial intelligence to 3D PC data processing and the GAN framework is analyzed, and its theoretical basis and core algorithms are clarified. Second, artificial intelligence techniques are used to build a digital restoration framework that combines 3D PC reconstruction and GANs. This is expected to provide a new theoretical foundation and a new perspective for the protection and restoration of historical buildings.

A. 3D PC RECONSTRUCTION AND GANs FRAMEWORK

To explore how modern computer science can reconstruct and repair historical buildings more accurately, computer algorithms are used to convert the collected point cloud data into accurate 3D models of historical buildings [16]. Such a model provides basic information on the shape and size of a building and better presents its detailed features, such as texture and color [17,18]. A GAN, in turn, learns the characteristics of historical buildings by training a pair of adversarial neural network models. The 3D PC shape repair process under the encoder–decoder network structure is shown in Figure 1.

Fig. 1. The 3D point cloud shape repair process under the encoder–decoder network structure.

In 3D PC restoration, the encoder captures the key structural information of the damaged point cloud, while the decoder attempts to reconstruct the complete point cloud data. The encoder is usually a neural network that converts a 3D PC into a low-dimensional representation, a conversion called encoding. During this process, the main features of the input data are captured while noise and irrelevant information are removed. The decoder converts the encoding back into an approximate representation of the original data, and the quality of the decoding is judged by the similarity between the decoded data and the original data. A high-fidelity decoding preserves the important features of the original data and reconstructs it accurately, thus precisely reflecting the structure and features of the original point cloud. The loss function measures the difference between the output point cloud and the label point cloud. The Earth Mover's Distance (EMD), which measures the distance between two distributions, is mainly used for this purpose, and its definition is shown in equation (1):

$$d_{EMD}(S_1,S_2)=\min_{\phi:S_1\to S_2}\frac{1}{n_1}\sum_{x\in S_1}\left\|x-\phi(x)\right\|_2 \tag{1}$$

In equation (1), $n_1$ is the total number of points in the point cloud $S_1$, and $\phi: S_1 \to S_2$ is a bijection, with $S_1 \subseteq \mathbb{R}^3$ and $S_2 \subseteq \mathbb{R}^3$. However, in practical applications, the original is usually not available as a direct reference. In that case, point clouds with higher visual fidelity can be generated by enhancing the learning ability of the generator within the GAN framework. The visual similarity between the repaired model and the original building should be ensured, including the accuracy of visual elements such as texture and color. This depends not only on geometric accuracy but also on understanding and reproducing the unique visual features of buildings. Meanwhile, the Chamfer distance (CD) is used to calculate the average squared distance between each point in one set and its nearest neighbor in the other set, as defined in equation (2):

$$CD(S_1,S_2)=\frac{1}{n_1}\sum_{x\in S_1}\min_{y\in S_2}\left\|x-y\right\|_2^2 \tag{2}$$

In equation (2), $n_1$ is the total number of points in the point cloud $S_1$, and $n_2$ is the total number of points in the point cloud $S_2$. The GAN framework is shown in Figure 2.
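The two distances in equations (1) and (2) can be illustrated on a toy example. The following is only a minimal sketch in plain Python, not the implementation used in the experiments: the function names and the four-point patch are hypothetical, the Chamfer distance is evaluated in one direction only (from the first set to the second), and the EMD is found by brute force over all bijections, which is feasible only for a handful of points.

```python
import itertools
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def chamfer_distance(s1, s2):
    """One-directional Chamfer distance as in equation (2):
    mean over x in s1 of the squared distance to its nearest y in s2."""
    return sum(min(sq_dist(x, y) for y in s2) for x in s1) / len(s1)


def emd_bruteforce(s1, s2):
    """EMD as in equation (1) for equal-sized sets, trying every bijection.
    The cost is factorial in the number of points (illustration only)."""
    if len(s1) != len(s2):
        raise ValueError("a bijection needs point sets of equal size")
    best = math.inf
    for perm in itertools.permutations(range(len(s2))):
        cost = sum(math.sqrt(sq_dist(x, s2[j])) for x, j in zip(s1, perm))
        best = min(best, cost / len(s1))
    return best


# Hypothetical data: a unit-square patch and the same patch shifted by 0.1 in x.
label = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
output = [(x + 0.1, y, z) for x, y, z in label]

cd = chamfer_distance(output, label)
emd = emd_bruteforce(output, label)
print(f"CD = {cd:.4f}, EMD = {emd:.4f}")
```

Because every output point lies 0.1 away from its counterpart, the EMD is about 0.1 and the CD, which uses squared distances, is about 0.01. Practical point clouds would require an approximate EMD solver and a nearest-neighbor index rather than these exhaustive loops.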

Fig. 2. GAN framework.

The GAN framework contains two independent neural networks, namely a generator and a discriminator. The generator attempts to generate data that are as close as possible to the distribution of the real data, while the discriminator distinguishes whether the input data come from the generator or from the real data, so that the two networks reach a dynamic balance. The discriminator is also continuously optimized through feedback so that it identifies the forged data of the generator as reliably as possible. The mathematical definition of the discriminative model D is shown in equation (3):

$$D(x)=\frac{p_{data}(x)}{p_z(z)+p_{data}(x)} \tag{3}$$

In equation (3), $x$ represents a real sample, $p_{data}(x)$ is the true data distribution of $x$, and $p_z(z)$ is the data distribution of the generated samples. The objective function that is optimized when solving the GAN is shown in equation (4):

$$\min_G\max_D V(D,G)=\mathbb{E}_{x\sim p_{data}(x)}[\log D(x)]+\mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))] \tag{4}$$

In equation (4), $z$ represents a random noise vector, $p_z(z)$ is the distribution of the random noise variable, $G(z)$ is the sample generated from $z$, and $\mathbb{E}$ denotes the expectation. Combining 3D PC shape restoration with the GAN framework can achieve precise restoration of historical buildings. This provides new possibilities for the restoration of historical buildings, as shown in the framework in Figure 3.
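To make equations (3) and (4) concrete, the sketch below evaluates them on hand-picked numbers. It is an illustration under stated assumptions rather than the training code of this study: the expectations are replaced by sample averages, the density values and discriminator outputs are hypothetical, and `p_gen` is a name introduced here for the distribution of the generated samples.

```python
import math


def optimal_discriminator(p_data, p_gen):
    """Equation (3): ratio of the real density to the sum of both densities at x."""
    return p_data / (p_gen + p_data)


def value_function(d_real, d_fake):
    """Sample-average estimate of V(D, G) in equation (4).

    d_real: discriminator outputs D(x) on real samples.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# If the generator matches the data distribution, both densities are equal,
# the discriminator in equation (3) outputs 0.5, and V equals -log 4.
d_star = optimal_discriminator(p_data=0.3, p_gen=0.3)
v_equilibrium = value_function([d_star] * 4, [d_star] * 4)

# A discriminator that still separates real from generated samples scores higher.
v_confident = value_function([0.9, 0.8, 0.95], [0.1, 0.2, 0.05])
print(d_star, v_equilibrium, v_confident)
```

The discriminator tries to push V upward, as in the second case, while the generator tries to pull it back toward the equilibrium value, which is the dynamic balance described above.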

Fig. 3. 3D point cloud shape repair and GAN framework.

The combination of 3D PC restoration and the GAN framework aims to achieve precise restoration of 3D PC data through deep learning. In this process, the encoder and decoder maintain high fidelity between the original data and the encoded and decoded data, so that the repaired building model reflects the true form of the historical building as accurately as possible. First, 3D PC reconstruction technology is used to reconstruct damaged or incomplete historical buildings in 3D. Second, the GAN is trained to learn the data distribution corresponding to the building characteristics. The loss of the 3D PC shape repair GAN consists of two parts: a generative adversarial loss based on the discriminator and a reconstruction loss. The adversarial loss is defined in equation (5):

$$L_{GAN}=\mathbb{E}_{x\sim p_{data}(x)}[\log D(X)]+\mathbb{E}_{x'\sim p_{data}(x')}[\log(1-D(G(X')))] \tag{5}$$

In equation (5), $X$ represents the real point cloud sample input, and $G(X')$ is the point cloud sample that the generator produces for the incomplete point cloud $X'$. $p_{data}(x)$ is the distribution of the real point cloud data, and $p_{data}(x')$ is the data distribution of the point clouds used for generation. The overall joint loss function of the 3D PC shape repair GAN is shown in equation (6):

$$L_{join}=\alpha_1 L_{GAN}+\alpha_2 L_{recon} \tag{6}$$

In equation (6), $\alpha_1$ and $\alpha_2$ are the weights of the adversarial loss $L_{GAN}$ and the reconstruction loss $L_{recon}$, respectively. During training, the generator and discriminator constantly compete, which improves both the quality of the generated data and the discriminative ability of the discriminator.
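The weighted combination in equation (6) can be sketched as follows. This is a hedged illustration only: the form of the reconstruction term is not specified here, so the one-directional Chamfer distance of equation (2) is assumed, and the weights, discriminator outputs, and point coordinates are hypothetical values chosen to make the arithmetic easy to follow.

```python
import math


def adversarial_loss(d_real, d_fake):
    """Equation (5): mean log D(X) on real clouds plus mean log(1 - D(G(X')))."""
    real = sum(math.log(d) for d in d_real) / len(d_real)
    fake = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real + fake


def reconstruction_loss(pred, target):
    """Assumed reconstruction term: one-directional Chamfer distance, equation (2)."""
    def sq(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sum(min(sq(p, t) for t in target) for p in pred) / len(pred)


def joint_loss(l_gan, l_recon, alpha1, alpha2):
    """Equation (6): weighted sum of the adversarial and reconstruction terms."""
    return alpha1 * l_gan + alpha2 * l_recon


# Hypothetical values: one real and one repaired cloud scored by the discriminator,
# and a two-point repaired cloud compared with its two-point label.
l_gan = adversarial_loss(d_real=[0.9], d_fake=[0.2])
l_recon = reconstruction_loss(
    pred=[(0.0, 0.0, 0.2), (1.0, 0.0, 0.0)],
    target=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
)
l_join = joint_loss(l_gan, l_recon, alpha1=0.05, alpha2=0.95)
print(l_gan, l_recon, l_join)
```

Giving the reconstruction term the larger weight, as assumed in this example, keeps the repaired cloud geometrically close to the label while the adversarial term encourages realistic detail. The actual weights are a tuning choice.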

B. ESTABLISHMENT OF A DIGITAL RESTORATION FRAMEWORK BASED ON 3D PC RECONSTRUCTION AND GANs

Precise reconstruction and restoration of historical buildings can be achieved by combining computer science and deep learning technologies [19]. Accurate 3D models of historical buildings can be obtained by collecting and processing a large amount of point cloud data with 3D PC reconstruction technology. The GAN includes two adversarial neural networks, of which the generator learns the characteristics of historical buildings in order to generate new repair results. The basic process of 3D PC reconstruction based on stereo vision is shown in Figure 4.

Fig. 4. Basic process of 3D point cloud reconstruction based on stereoscopic vision.

Stereo vision-based 3D PC reconstruction involves fields such as image processing and computer vision [20]. First, multi-view images of the target object are collected with the acquisition device, and feature extraction and matching are performed to determine the corresponding points between the images, which reflect the 3D spatial position of the object. The camera parameters are then used for triangulation to obtain the 3D coordinates of the object [21]. Because the collected point cloud data contain noise and omissions, a surface reconstruction algorithm is used to obtain the 3D model of the target object [22]. Assuming that the surface of the spatial entity or scene is an ideal reflecting surface, a grayscale consistency function is defined for each patch, and its mathematical expression is shown in equation (7):

$$g(p)=\frac{1}{\left|V(p)\setminus R(p)\right|}\sum_{I\in V(p)\setminus R(p)}h(p,I,R(p)) \tag{7}$$

In equation (7), $V(p)$ is the set of images in which patch $p$ is visible, $R(p)$ is the reference image of $p$, and $h(p,I_1,I_2)$ represents the grayscale consistency function of images $I_1$ and $I_2$ for patch $p$. Without considering the influence of occlusions or obstacles, $V^*(p)$ represents the set of images in $V(p)$ that satisfy grayscale consistency, and the corresponding function is shown in equation (8):

$$g^*(p)=\frac{1}{\left|V^*(p)\setminus R(p)\right|}\sum_{I\in V^*(p)\setminus R(p)}h(p,I,R(p)) \tag{8}$$

In equation (8), $R(p)$ represents the reference image of patch $p$, and only images whose grayscale difference from the reference image $R(p)$ is less than the threshold $\alpha$ are retained in $V^*(p)$. Each patch $p$ has a corresponding image block in each image where it is visible. The digital repair process based on 3D PC reconstruction and GAN is shown in Figure 5.
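The filtering step that turns $V(p)$ into $V^*(p)$ can be sketched with a few numbers. This is a minimal illustration, assuming that $h$ is available as one grayscale difference per image (lower means better agreement with the reference). The image names, the values of $h$, and the threshold are all hypothetical.

```python
def mean_consistency(h_values, reference):
    """Equations (7) and (8): average h(p, I, R(p)) over all images except R(p)."""
    others = [h for name, h in h_values.items() if name != reference]
    if not others:
        raise ValueError("no image left besides the reference")
    return sum(others) / len(others)


def filter_visible(h_values, reference, alpha):
    """Build V*(p): keep the reference and every image whose difference is below alpha."""
    return {name: h for name, h in h_values.items()
            if name == reference or h < alpha}


# Hypothetical grayscale differences for one patch p seen in four images.
h_p = {"ref": 0.0, "img1": 0.10, "img2": 0.20, "img3": 0.90}

g = mean_consistency(h_p, "ref")                  # average over V(p) \ R(p)
v_star = filter_visible(h_p, "ref", alpha=0.5)    # img3 is rejected
g_star = mean_consistency(v_star, "ref")          # average over V*(p) \ R(p)
print(g, sorted(v_star), g_star)
```

In this example the outlying image raises g(p) to about 0.4, whereas g*(p), computed after filtering, is about 0.15. This is why the thresholded set gives a more reliable score when some views are occluded.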

Fig. 5. Digital repair process based on 3D point cloud reconstruction and GAN.

The digital restoration method based on 3D PC reconstruction and GAN first collects point cloud data of the target object and establishes a 3D model. After the model is input into the GAN, the generator outputs possible repair solutions, and the discriminator judges the authenticity of the repair results [23]. Through the mutual confrontation between the generator and the discriminator, a repair result that is close to the real object is generated, achieving precise repair. The objective optimization function of the GAN is shown in equation (9):

$$\min_G\max_D V(D,G)=\mathbb{E}_{x\sim p_{data}(x)}[\log D(x)]+\mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))] \tag{9}$$

In equation (9), $V(D,G)$ is the value function, $\mathbb{E}$ denotes the expectation, $x$ is a real sample, $p_{data}(x)$ is the true distribution of the training samples, and $D(x)$ is the probability with which the discriminator judges $x$ to be a real image. $z$ represents the noise input to the generator, $p_z(z)$ is the distribution of that noise, and $G(z)$ is the image generated by the generator. The digital restoration framework is used for repairing missing or damaged objects, such as historical buildings and artworks. The digital repair framework based on 3D PC reconstruction and GAN is shown in Figure 6.

Fig. 6. A digital repair framework based on 3D point cloud reconstruction and GAN.

The digital restoration framework uses computer-aided technology to build a high-fidelity 3D digital model of a missing or damaged object [24]. Such digital models store not only the visual information of an object but also its structural and geometric information. Physical restoration is the process of restoring damaged historical buildings or artworks, while digital restoration provides a detailed repair plan and reference basis for that process. The framework first obtains point cloud data of the object surface with a 3D scanning device. After the data are processed and analyzed with computer vision technology, a 3D model of the object is established and input into a GAN for adversarial learning. The generator produces data that are close to the real object, while the discriminator distinguishes between the real object and the generated object [25]. Finally, the data output by the generator are used to construct a 3D digital model of the object. When repairing an ancient building, experts can refer to this 3D digital model to understand the detailed structure of the missing part and then repair it manually, so that the repaired object is visually and structurally close to its original form. The fine generator also resembles an encoder–decoder structure, consisting of a gated convolutional layer, an improved dilated gated convolutional layer, and contextual attention. The similarity used by the contextual attention is shown in equation (10):

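$$S_{x,y,x',y'}=\left\langle\frac{f_{x,y}}{\left\|f_{x,y}\right\|},\frac{b_{x',y'}}{\left\|b_{x',y'}\right\|}\right\rangle \tag{10}$$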

In equation (10), $f_{x,y}$ represents the feature of a patch in the missing region, $b_{x',y'}$ represents the feature of a patch in the valid region, and $S_{x,y,x',y'}$ is the similarity between the missing-region feature and the known valid-region feature. The gated convolution, which automatically learns its update rule from the data, is expressed in equation (11):

$$\begin{cases}Gating_{y,x}=W_g\cdot I\\ Feature_{y,x}=W_f\cdot I\\ O_{y,x}=\phi(Feature_{y,x})\odot\sigma(Gating_{y,x})\end{cases} \tag{11}$$

In equation (11), $W_g$ and $W_f$ are two different convolutional filters, $\sigma$ represents the sigmoid function, and $\phi$ can be any activation function. $O_{y,x}$ is the output obtained by applying the two filters to the input $I$ and multiplying the activated feature by the gate. The digital restoration framework based on 3D PC reconstruction and GAN can achieve accurate, efficient, and visual digital restoration of objects.
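Equations (10) and (11) can be traced by hand on a very small input. The sketch below is an assumption-laden illustration in plain Python rather than the network layers used in this study: it handles a single channel without padding, the filters `w_f` and `w_g` are fixed hypothetical values instead of learned weights, and tanh is chosen arbitrarily as the activation $\phi$.

```python
import math


def cosine_similarity(f, b):
    """Equation (10): inner product of the two L2-normalized feature vectors."""
    norm_f = math.sqrt(sum(v * v for v in f))
    norm_b = math.sqrt(sum(v * v for v in b))
    return sum(fv * bv for fv, bv in zip(f, b)) / (norm_f * norm_b)


def conv2d_valid(image, kernel):
    """Single-channel 2D cross-correlation without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def sigmoid(v):
    """Logistic function used for the gate."""
    return 1.0 / (1.0 + math.exp(-v))


def gated_conv2d(image, w_f, w_g, phi=math.tanh):
    """Equation (11): phi(W_f . I) multiplied element-wise by sigmoid(W_g . I)."""
    feature = conv2d_valid(image, w_f)
    gating = conv2d_valid(image, w_g)
    return [[phi(f) * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feature, gating)]


# Toy 3x3 input whose right column and bottom row are missing (set to 0).
image = [[1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
w_f = [[0.25, 0.25], [0.25, 0.25]]   # hypothetical feature filter
w_g = [[1.0, 1.0], [1.0, 1.0]]       # hypothetical gating filter
out = gated_conv2d(image, w_f, w_g)

# Similarity of a missing-region feature to a parallel and an orthogonal feature.
s_same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
s_orth = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(out, s_same, s_orth)
```

With these filters the gate is most open where the window is fully covered by valid pixels and closes progressively as more of the window falls in the missing region. A learned gating filter would be expected to sharpen this behavior.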

IV. TESTING AND ANALYSIS OF DIGITAL REPAIR BASED ON 3D PC RECONSTRUCTION AND GANs

A unified experimental environment was used for testing and analysis to verify the digital repair performance of the proposed fusion of 3D PC reconstruction and GAN. The hardware environment comprised an Intel Core i7-9700K CPU, an NVIDIA GeForce RTX 2080 Ti graphics card, 32 GB of DDR4 memory, a 512 GB SSD, and an Artec Eva 3D scanner as the 3D scanning device. The software environment comprised the Windows 10 64-bit Professional Edition operating system, the point cloud processing software Point Cloud Library (PCL) 1.8.1, the deep learning framework TensorFlow 2.3.1, the programming language Python 3.8.5 with the development tool PyCharm, and the data visualization tool MeshLab. The digital repair method based on 3D PC reconstruction and GAN was tested and analyzed in this environment. Two main datasets were used. The first was the 3D cultural relic scanning dataset provided by XYZ Cultural Heritage Group, which comes from different museums and archaeological sites around the world and includes 3D PC data of various cultural relics. The second was the architecture-specific 3D dataset provided by Historical Monuments 3D Archive, which includes 3D scanning data obtained from different types of buildings such as ancient palaces, city walls, pagodas, altars and temples, and ancient tombs. The structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR) were used as standardized indicators for evaluating technical performance on these datasets. The comparison of the SSIM results for ancient artifacts and buildings using different algorithms is shown in Figure 7.
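For reference, the two indicators can be computed as in the following minimal sketch. It is not the evaluation code of this study: it assumes that both metrics are evaluated on grayscale renderings with 8-bit intensities, it computes SSIM over the whole image as a single window using the commonly used constants 0.01 and 0.03 (practical implementations average over sliding local windows), and the pixel values are hypothetical.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(reference, restored, max_value=255.0):
    """SSIM computed over the whole image as a single window."""
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(restored) / n
    var_x = sum((v - mu_x) ** 2 for v in reference) / n
    var_y = sum((v - mu_y) ** 2 for v in restored) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(reference, restored)) / n
    c1 = (0.01 * max_value) ** 2   # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2   # stabilizes the contrast/structure term
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


# Hypothetical flattened 2x2 grayscale images: a reference and a restored version.
reference = [50.0, 100.0, 150.0, 200.0]
restored = [52.0, 98.0, 151.0, 199.0]

print(f"PSNR = {psnr(reference, restored):.2f} dB")
print(f"SSIM = {global_ssim(reference, restored):.4f}")
```

SSIM is bounded above by 1, which is reached only when the restored image equals the reference, while PSNR grows without bound as the mean squared error approaches zero.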

Fig. 7. Comparison of SSIM results for antiquities and buildings using different algorithms.

Figure 7 shows that combining 3D PC reconstruction and GAN technology performed well in the restoration of both cultural relics and ancient buildings. In Figure 7(a), as the number of repairs increased, the SSIM of all technologies increased. The SSIM of the proposed technology rose from an initial 0.38 to 0.94, and both its growth rate and its growth amplitude were significantly higher than those of the other algorithms. In Figure 7(b), the SSIM of the 3D PC reconstruction algorithm rose from 0.34 to 0.94, demonstrating strong repair ability. The SSIM of 3D laser scanning technology grew relatively little, rising only from 0.37 to 0.46. This validated the effectiveness and advantages of the 3D PC reconstruction algorithm in the restoration of ancient buildings. The comparison of SSIM results for different types of buildings using different algorithms is shown in Table I.

Table I. Comparison of SSIM results for different types of buildings using different algorithms

| Building type | 3D laser scanning technology | Close-range camera measurement technology | 3D printing | Proposed technology |
|---|---|---|---|---|
| Imperial palace buildings | 0.82 | 0.86 | 0.89 | 0.98 |
| Defensive city walls | 0.79 | 0.87 | 0.82 | 0.97 |
| Pagoda | 0.77 | 0.83 | 0.86 | 0.99 |
| Altars and temples | 0.75 | 0.86 | 0.90 | 0.96 |
| Mausoleum | 0.76 | 0.87 | 0.89 | 0.98 |

Table I shows that the SSIM results obtained with different algorithms differed significantly in the digital restoration of five types of buildings (palace buildings, city walls, pagodas, altars and temples, and tombs). The SSIM of 3D laser scanning technology varied between 0.75 and 0.82, indicating that its repair effect was only average across building types. The SSIM of close-range camera measurement technology was between 0.83 and 0.87, which was better than that of 3D laser scanning technology. The SSIM of 3D printing ranged from 0.82 to 0.90, with the best restoration effect of 0.90 for altars and temples and the poorest of 0.82 for city walls. The proposed technology achieved the highest SSIM for every building type, ranging from 0.96 to 0.99, with 0.99 for the pagoda and 0.98 for palace buildings. The PSNR results of different algorithms for ancient artifacts and buildings are shown in Figure 8.

Fig. 8. PSNR results of ancient artifacts and buildings using different algorithms.

In Figure 8(a), the PSNR of the GAN algorithm was 48.37, significantly surpassing the other algorithms and verifying its advantage in image quality. The PSNR of 3D laser scanning technology was only 40.62. The PSNR values of 3D printing technology and close-range camera measurement technology were relatively close, at 42.41 and 43.08, respectively, indicating that these two technologies performed similarly in terms of restoration quality. In Figure 8(b), the PSNR of the GAN algorithm increased further to 49.08, demonstrating its excellent performance in image restoration. The PSNR of 3D laser scanning technology was 38.24, while the PSNR values of 3D printing technology and close-range camera measurement technology were 40.98 and 45.27, respectively. The comparison of the repair effects on different defects is shown in Table II.

Table II. The repair effect of different algorithms on different defects

| Evaluation index | 3D laser scanning technology | Close-range camera measurement technology | 3D printing | Proposed technology |
|---|---|---|---|---|
| Surface defect | 81.14 | 88.38 | 89.46 | 93.34 |
| Structural defect | 75.48 | 89.18 | 83.07 | 93.25 |
| Color defect | 76.37 | 87.38 | 86.22 | 96.51 |
| Decorative defect | 83.56 | 80.08 | 85.16 | 94.67 |
| Other defects | 82.39 | 87.81 | 88.63 | 99.97 |

In Table II, the proposed technology outperformed 3D laser scanning technology, close-range camera measurement technology, and 3D printing in repairing surface, structural, color, decorative, and other defects. For surface defects, it was 12.20 percentage points higher than 3D laser scanning technology and 4.96 percentage points higher than close-range camera measurement technology. For structural defects, it was 17.77 percentage points higher than 3D laser scanning technology and 10.18 percentage points higher than 3D printing technology. For color defects, it was 20.14 percentage points higher than 3D laser scanning technology. For decorative defects, it was 14.59 percentage points higher than close-range camera measurement technology. For other defects, it was 17.58 percentage points higher than 3D laser scanning technology and 12.16 percentage points higher than close-range camera measurement technology. The proposed technology was thus validated to have significant advantages in repairing every defect type, providing better results than the other technologies. The restoration error results of different algorithms for different types of ancient buildings are shown in Figure 9.
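The differences quoted above follow directly from Table II. The short sketch below reproduces them as simple subtractions. The dictionary keys and the helper name `gains_over` are introduced here for illustration only.

```python
# Repair scores from Table II, one list per method, in the row order
# surface, structural, color, decorative, other.
table2 = {
    "3D laser scanning": [81.14, 75.48, 76.37, 83.56, 82.39],
    "Close-range camera": [88.38, 89.18, 87.38, 80.08, 87.81],
    "3D printing": [89.46, 83.07, 86.22, 85.16, 88.63],
    "Proposed": [93.34, 93.25, 96.51, 94.67, 99.97],
}
defects = ["surface", "structural", "color", "decorative", "other"]


def gains_over(baseline):
    """Proposed score minus baseline score per defect type, rounded to 2 decimals."""
    return {d: round(p - b, 2)
            for d, p, b in zip(defects, table2["Proposed"], table2[baseline])}


laser_gain = gains_over("3D laser scanning")
camera_gain = gains_over("Close-range camera")
printing_gain = gains_over("3D printing")

for name, gain in [("laser", laser_gain), ("camera", camera_gain),
                   ("printing", printing_gain)]:
    print(name, gain)
```

The largest margin over 3D laser scanning is obtained for color defects (20.14), and the smallest margin over any baseline is the 3.88 obtained over 3D printing for surface defects.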

Fig. 9. The restoration error results of different algorithms for different types of ancient buildings.

In Figure 9, both the digital restoration and GAN algorithms had errors of no more than 0.57 when dealing with the various types of buildings, indicating that these methods could effectively carry out digital restoration of historical buildings. The error of the GAN algorithm was relatively small, with repair errors of 0.17, 0.12, 0.13, 0.11, and 0.09 for palace buildings, defensive walls, pagodas, altars and temples, and tomb buildings, respectively. The error of 3D laser scanning technology was relatively high, with values of 0.47, 0.56, 0.44, 0.34, and 0.36, respectively. Therefore, the GAN algorithm had advantages in the digital restoration of historical buildings, with stable restoration effects and small errors, which is beneficial for the protection and restoration of historical buildings. The comparison of the repair results for lost patterns and cracks is shown in Figure 10.

Fig. 10. Comparison of repair results of lost patterns and cracks using different algorithms.

In Figure 10, the GAN algorithm exhibited excellent repair ability for both lost patterns and cracks. In Figure 10(a), when repairing the lost Buddha image pattern, the technique achieved a high degree of restoration without any obvious repair marks. In Figure 10(b), the GAN algorithm repaired the cracks without visible traces, almost restoring the image to its original state. This indicated the superior performance of the GAN algorithm in image restoration and its strong capability in dealing with complex lost patterns and cracks, which allowed it to achieve accurate and high-quality repair results.

V. CONCLUSION

With the rapid development of computer technology and artificial intelligence, digital restoration technology is becoming increasingly important in the protection of historical buildings. Traditional repair methods rely on manual labor, which is time-consuming and may produce inconsistent results. To address this issue, a digital restoration technology for historical buildings based on 3D PC reconstruction and GAN was proposed. The experiments showed that combining 3D PC reconstruction and GAN algorithms gave superior performance in historical building restoration. The SSIM of this technology increased with the number of repairs, rising from an initial 0.38 to 0.94, and both its growth rate and its growth amplitude were significantly higher than those of the other algorithms. The SSIM of 3D laser scanning technology grew relatively little, rising only from 0.37 to 0.46. The PSNR of the GAN algorithm was 48.37, significantly surpassing the other algorithms and verifying its advantage in image quality. The PSNR of 3D laser scanning technology was only 40.62, and the PSNR values of 3D printing technology and close-range camera measurement technology were relatively close, at 42.41 and 43.08, respectively. The study validated the effectiveness and advantages of 3D PC reconstruction algorithms in the restoration of ancient buildings. Future work will continue to address more complex issues in the restoration of historical buildings in order to further improve the digital restoration effect.