I.INTRODUCTION
Artificial intelligence (AI) and machine learning (ML) are interdisciplinary domains that have found support and applications across science, technology, and engineering, as well as business, finance, and economics. They have transformed the traditional model-based design and development process into a data-driven learning process. On one side, AI is driven by each domain's needs; on the other, AI is applied in these domains to augment the performance and capacity of many applications.
II.NONCONTACT VITAL SIGNS MONITORING FROM VIDEO DATA
For computer vision-based applications, AI enables computers and systems to derive meaningful information from digital images, videos, and other visual inputs and to take actions or make decisions based on low-level visual stimuli. Backed by AI and ML, computers can mimic human beings in seeing, observing, and understanding.
Noncontact measurement of vital signs from video data has become very popular in applications such as monitoring patients in clinical settings and following subjects such as the elderly during daily activities [1]. In this article, we present some proposals for the estimation of the main vital signs, namely respiratory rate (RR), heart rate (HR), and oxygen saturation (OS), using video processing.
HR estimation is based on the principle of photoplethysmography (PPG), which tracks subtle color changes on the skin using a video camera. These subtle changes correspond to the oxygenation changes occurring during the cardiac cycle [2]. Chen and McDuff propose a method using a motion representation based on a skin reflection model and a deep convolutional network that uses appearance information to guide motion estimation; this proposal has the advantage of being robust to heterogeneous lighting and major motions [3]. Chaichulee et al. propose a deep learning framework that tracks the cardiorespiratory signals of patients in a hospital ward and identifies the time periods from which vital signs can be estimated. The system comprises two convolutional neural networks (CNNs): the first detects the presence of a patient and segments the skin area, and the second identifies and excludes periods of clinical intervention so that vital signs are estimated only when the patient is calm [4]. Bousefsaf et al. present a fully automatic method based on 3D convolutional networks to estimate the HR that does not require any additional image preprocessing; the authors propose a particular training procedure that employs only synthetic data [5]. Yu et al. propose an end-to-end deep learning strategy to estimate the HR from highly compressed facial videos. The system uses a first network to enhance the spatiotemporal video and a second network to recover the PPG signal and then estimate the HR; this strategy achieves performance on compressed videos comparable to that on high-quality videos [6]. Hsu et al. present a deep learning framework for real-time estimation of HR using an RGB camera. The approach includes detection of facial regions of interest, illumination rectification, artifact removal, and PPG signal amplification; a 2D time–frequency representation for signal characterization is then used as the input of a deep CNN [7].
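To make the underlying PPG principle concrete, the following is a minimal sketch, not any of the cited methods: given a per-frame mean of the green channel over a skin region, it removes the DC component and scans the plausible cardiac band for the dominant frequency. The function name, band limits, and the synthetic trace are illustrative assumptions.

```python
import math

def estimate_hr_bpm(green_means, fps, f_min=0.7, f_max=3.0, step=0.01):
    """Estimate heart rate from a per-frame mean-green trace.

    Scans the plausible cardiac band (f_min..f_max Hz) and returns, in
    beats per minute, the frequency whose single DFT bin carries the
    most power in the detrended trace.
    """
    n = len(green_means)
    mean = sum(green_means) / n
    x = [v - mean for v in green_means]          # remove the DC component
    best_f, best_power = f_min, -1.0
    n_steps = int(round((f_max - f_min) / step))
    for k in range(n_steps + 1):
        f = f_min + k * step
        # Power of the trace at frequency f (one discrete Fourier bin).
        re = sum(x[i] * math.cos(2 * math.pi * f * i / fps) for i in range(n))
        im = sum(x[i] * math.sin(2 * math.pi * f * i / fps) for i in range(n))
        power = re * re + im * im
        if power > best_power:
            best_power, best_f = power, f
    return best_f * 60.0                         # Hz -> beats per minute

# Synthetic 10 s trace at 30 fps: a 1.2 Hz (72 bpm) pulse on a flat baseline.
fps = 30.0
trace = [128 + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps) for i in range(300)]
hr = estimate_hr_bpm(trace, fps)
```

Real pipelines add the steps surveyed above (skin segmentation, illumination rectification, motion robustness); this sketch only shows why a subtle periodic color change suffices to recover the cardiac frequency.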
OS estimation is based on the absorbance rates of red light and infrared light in PPG images. The challenge is the calibration of the system to obtain an accurate measure. Fatima et al. use Eulerian video magnification to amplify the changes in skin color due to the cardiac cycle; the calibration was performed against a contact sensor as reference by means of a linear regression [8]. Moço et al. use the red and green channels to estimate the OS; nevertheless, skin temperature (cooling) and the position of the region of interest (ROI) on the face are sources of variability in the estimation [9]. To the best of our knowledge, there are no proposals in the literature that use AI to tackle the OS estimation problem using PPG images; hence, this remains an open research problem.
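The absorbance principle above is usually operationalized as the classic ratio-of-ratios estimate. The sketch below is illustrative, not the cited authors' method: it approximates AC by peak-to-peak amplitude and DC by the mean, and the linear map coefficients `a` and `b` are placeholder values that would have to be calibrated against a contact oximeter, as in [8].

```python
def spo2_ratio_of_ratios(red, infrared, a=110.0, b=25.0):
    """Ratio-of-ratios SpO2 estimate from red and infrared PPG traces.

    R = (AC_red / DC_red) / (AC_ir / DC_ir), mapped linearly to a
    saturation percentage; a and b are illustrative placeholders that
    must be fitted against a reference contact sensor.
    """
    def ac_dc(signal):
        dc = sum(signal) / len(signal)       # baseline (DC) component
        ac = max(signal) - min(signal)       # pulsatile (AC) amplitude
        return ac, dc

    ac_r, dc_r = ac_dc(red)
    ac_ir, dc_ir = ac_dc(infrared)
    r = (ac_r / dc_r) / (ac_ir / dc_ir)
    return a - b * r

# Illustrative traces: both have DC 100; peak-to-peak 1 (red) vs. 2 (infrared).
red = [99.5, 100.5] * 50
infrared = [99.0, 101.0] * 50
spo2 = spo2_ratio_of_ratios(red, infrared)   # R = 0.5 for these traces
```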
RR estimation uses the principle of human body motion during the respiratory cycle, for example, tracking the thorax motion to recover the respiratory signal. Al-Naji and Chahl present a strategy that applies video motion magnification to an ROI to enhance the image frames and then uses segmentation algorithms to classify the frames as inhalations or exhalations to estimate the RR [10], [11]. Brieva et al. propose a strategy using motion magnification and a CNN to classify whole frames as inhalations or exhalations and estimate the RR from this tagging [12].
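Once frames are tagged as inhalations or exhalations, turning the tagging into a rate is simple. This is a minimal sketch of that last step under an assumed labeling convention (1 = inhalation, 0 = exhalation), not the classifiers of [10]–[12] themselves.

```python
def rr_from_labels(labels, fps):
    """Respiratory rate from per-frame inhale(1)/exhale(0) labels.

    Each exhale->inhale transition starts a new breath; counting them
    over the clip duration gives breaths per minute.
    """
    breaths = sum(1 for a, b in zip(labels, labels[1:]) if a == 0 and b == 1)
    duration_min = len(labels) / fps / 60.0
    return breaths / duration_min

# Illustrative 60 s clip at 10 fps with 15 breathing cycles.
fps = 10.0
labels = ([0] * 20 + [1] * 20) * 15
rr = rr_from_labels(labels, fps)
```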
III.PREVIEW OF THE STUDIES IN THIS ISSUE
Continuing with our strengths in supporting fundamentals in AI development and industrial applications from the previous issue [13], we publish five papers focusing on data mining and its applications.
We selected three papers focusing on data mining and denoising, and two papers on data set analysis and applications in traffic simulation and in green finance development and its impact on the ecologicalization of urban industrial structure.
A.DATA MINING AND DENOISING
We selected three papers in this topic area that support the fundamental issues in the development and improvement of artificial intelligence and machine learning research.
The first paper in this section studies biclusters and microarrays. Biclustering is a data mining technique that allows simultaneous clustering of the rows and columns of microarrays, which contain a large matrix of information. Such information is widely used in biology research and biodata analysis for monitoring the combinations of genes in different organisms. The paper proposes a coherent pattern mining algorithm based on all-contiguous-column biclusters. In this research, continuous time changes are incorporated into the coherent patterns across all contiguous columns, which allows co-expression patterns in time series to be searched. Simulation experiments, which include synthetic data sets and real gene microarray data sets, are conducted to verify the biological significance of the mined biclusters. The performance of the algorithm is also evaluated, and the results show that the algorithm is highly efficient [2].
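To illustrate what a coherent pattern means here, the following sketch checks one common simplification, an additive-coherent bicluster, in which every selected row follows the same column-to-column trend up to a constant per-row offset. This is a hedged illustration of the general concept, not the paper's mining algorithm; the function name and tolerance are assumptions.

```python
def is_additive_coherent(matrix, rows, cols, tol=1e-9):
    """Check whether the submatrix (rows x cols) is additive-coherent:
    each row equals the first selected row plus a constant offset."""
    base = [matrix[rows[0]][c] for c in cols]
    for r in rows[1:]:
        offset = matrix[r][cols[0]] - base[0]
        if any(abs(matrix[r][c] - base[i] - offset) > tol
               for i, c in enumerate(cols)):
            return False
    return True

# Toy expression matrix: rows 0 and 1 share a trend (+2 offset); row 2 does not.
expr = [
    [1.0, 2.0, 4.0],
    [3.0, 4.0, 6.0],
    [1.0, 9.0, 2.0],
]
coherent = is_additive_coherent(expr, [0, 1], [0, 1, 2])
incoherent = is_additive_coherent(expr, [0, 2], [0, 1, 2])
```

A mining algorithm searches for maximal row/column sets passing such a test over contiguous columns; the check above is only the membership predicate.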
The second paper in this section discusses the CNN and its application in image denoising. Current studies often rely on a single deep network to train an image denoising model, and these approaches often cannot achieve the best performance on complex images. This paper proposes a hybrid denoising CNN model, which consists of a dilated block, a RepVGG block, a feature refinement block (FB), and a single convolution. The dilated block combines a dilated convolution, batch normalization, a common convolution, and a rectified linear unit (ReLU) activation function to obtain more context information. The RepVGG block is based on a structural re-parameterization technique and uses a parallel combination of convolution, feature refinement, and ReLU activations to extract complementary width features. The feature refinement block is used to obtain more accurate information by refining the features obtained from the RepVGG block. These key components give the proposed hybrid denoising CNN better performance in image denoising. Experiments on public data sets are performed, and the results show a good denoising effect [14].
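The key idea behind the dilated block, enlarging the receptive field without adding weights, can be seen in one dimension. The sketch below is a generic illustration of dilated convolution (valid padding, no learned weights), not the paper's model.

```python
def dilated_conv1d(signal, kernel, dilation=2):
    """1D dilated convolution with valid padding.

    The kernel taps are spaced `dilation` samples apart, so a 3-tap
    kernel with dilation 2 spans 5 input samples, widening the context
    each output sees at no extra parameter cost.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1                # receptive field size
    return [sum(kernel[j] * signal[i + j * dilation] for j in range(k))
            for i in range(len(signal) - span + 1)]

# A 3-tap averaging-style kernel over 5 samples:
dilated = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
plain = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1)
```

With `dilation=2` the single output mixes samples 1, 3, and 5; the same kernel with `dilation=1` only ever sees three adjacent samples.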
The third paper in this section addresses denoising visual images in hazy weather environments. Existing dehazing algorithms conduct image dehazing by adjusting image brightness and contrast or by constructing artificial priors, such as color attenuation priors and dark channel priors; however, the results are often unstable when dealing with complex scenarios. In CNN-based encoding–decoding approaches, the image dehazing network may not consider the differences before and after dehazing, resulting in the loss of image spatial information in the encoding stage. To address these problems, this paper develops a new end-to-end two-stream CNN for single image dehazing. The network model consists of a spatial information feature stream, which retains the original information of the hazy image, and a high-level semantic feature stream, which is extracted from the multiscale structural features of the hazy image. Furthermore, a spatial information auxiliary module placed between the two feature streams employs an attention mechanism to construct a unified expression of different types of information and implements the gradual restoration of the clear image, with the semantic information assisting the spatial information in the dehazing network. A parallel residual twicing module is further used to process the difference information of features at different stages to improve the model's ability to discriminate haze images. A quantitative measure is defined to evaluate the similarity between each algorithm's dehazing results and the original image, and the results show that the proposed algorithm achieves a higher ratio than existing algorithms on the same data set [14].
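The prior-based baselines mentioned above (e.g., the dark channel prior) all rest on the atmospheric scattering model I = J·t + A·(1 − t), where I is the hazy observation, J the scene radiance, A the airmass light, and t the transmission. The per-pixel inversion below is a generic illustration of that model, not the paper's network; the clamp `t_min` is a standard numerical safeguard.

```python
def dehaze_pixel(i, a, t, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t)
    to recover scene radiance J for one pixel intensity i.

    t is clamped away from zero so thick-haze pixels do not blow up.
    """
    t = max(t, t_min)
    return (i - a) / t + a

# Round trip: synthesize a hazy pixel from known J, A, t, then invert.
a, t = 1.0, 0.8
j_true = 0.5
i_hazy = j_true * t + a * (1 - t)    # forward scattering model
j_restored = dehaze_pixel(i_hazy, a, t)
```

Prior-based methods spend their effort estimating A and t from the image; learning-based methods such as the two-stream CNN above sidestep that explicit estimation.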
B.DATA ANALYSIS AND DOMAIN APPLICATIONS
We selected two papers in this topic area that apply data sets and their analyses to a domain of application.
The first paper in this topic area presents a traffic simulation that uses a real data set to find the shortest travel time from point A to point B. The simulation platform is a part of the Visual IoT/Robotics Programming Language Environment (VIPLE) [15]. The simulation experiments allow the user to program and control a particular vehicle to navigate through a map with traffic, where the traffic can be generated randomly or in a controlled way. In this paper, the real data set recorded by the Maricopa County, Arizona, government is used to generate the traffic in a controlled way, so that the controlled vehicle can dynamically recalculate its route based on the current traffic simulation [16]. The application creates an environment similar to Google Maps driving directions, with the capacity to find the shortest-time path to the destination in dynamic traffic situations. Since the developed simulation platform can be programmed in a visual language and has a simplified user interface, the simulation experiments can be developed and conducted with a short learning curve for educational purposes, and the experiment environment can be used in courses related to algorithms, transportation, scheduling, etc.
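The shortest-time routing described above is classically solved with Dijkstra's algorithm over edge travel times; rerunning it whenever traffic updates the weights gives the dynamic recalculation behavior. The sketch below is a generic textbook implementation, not the VIPLE platform's code; the toy road graph is an assumption.

```python
import heapq

def shortest_time(graph, src, dst):
    """Dijkstra's algorithm over a graph of edge travel times.

    graph maps a node to a list of (neighbor, travel_time) pairs;
    returns the minimum total travel time from src to dst, or
    infinity if dst is unreachable.
    """
    dist = {src: 0.0}
    pq = [(0.0, src)]                          # (time so far, node)
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float('inf')):      # stale queue entry
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float('inf')

# Toy road network: the direct A->B link is slower than the detour via C.
roads = {
    'A': [('B', 4.0), ('C', 1.0)],
    'C': [('B', 2.0)],
    'B': [('D', 5.0)],
}
eta = shortest_time(roads, 'A', 'D')           # best route: A -> C -> B -> D
```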
The second paper in this topic area investigates the factors affecting the ecologicalization of urban industrial structure. The paper quantitatively studies and analyzes the relationship between green finance development and the ecologicalization of urban industrial structure, based on a comprehensive index of green finance development built from panel data of the target cities over the period 2012–2020 [17]. The analytic results illustrate that green finance development significantly enhances the ecologicalization level of urban industrial structure. In particular, green finance plays a stronger role in promoting the ecologicalization of industrial structure in economically developed regions than in economically underdeveloped regions. The research results can provide valuable policy guidance for urban green financial market planning and related green product innovation [18].