
Open Access 01.12.2023

Investigating the Efficiency of Using U-Net, Erf-Net and DeepLabV3 Architectures in Inverse Lithography-based 90-nm Photomask Generation

Authors: I. M. Karandashev, G. S. Teplov, A. A. Karmanov, V. V. Keremet, A. V. Kuzovkov

Published in: Optical Memory and Neural Networks | Issue 4/2023


Abstract

The paper deals with the inverse problem of computational lithography. We turn to deep neural network algorithms to compute photomask topologies. The chief goal of the research is to understand how efficient neural net architectures such as U-Net, ErfNet and DeepLabV3+, as well as the built-in Calibre Workbench algorithms, can be in tackling inverse lithography problems. Specially generated and labeled data sets are used to train the artificial neural nets. Calibre EDA software is used to generate random patterns for a 90 nm transistor gate mask. Accuracy and speed are used for the comparison, with the edge placement error (EPE) and intersection over union (IoU) as metrics. The use of the neural nets reduces the mask computation time by two orders of magnitude, while the accuracy stays at 92% in terms of the IoU metric.
Notes

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 INTRODUCTION

Photolithography plays a significant role in modern microchip manufacturing processes. The choice of lithography technique depends on the size of the smallest chip element and the production volume. At nodes of 180 nm or less (90, 65, 32 nm), the wave nature of the 193-nm source used in exposure brings about optical proximity effects [1]. These effects can distort the microchip geometry and reduce the yield [2]. The major pattern irregularities are line end shortening, rounded corners, spurious inter-feature bridges, narrower or wider line widths, etc. [3]. Modern photolithography technologies involve a number of resolution enhancement methods enabling the correction of optical proximity effects [4]. Some of these methods introduce calculated changes to the photomask geometry to compensate for errors caused by diffraction effects. Two methods can be used to correct the optical proximity effects for the 90-nm and 180-nm nodes. The first method is optical proximity correction (OPC), which follows either a rule-based approach [5] or a model-based approach [6]. The second method, inverse lithography technology (ILT), enables the optimization of the shapes of the openings in photomasks [7]. The use of ILT requires considerable computational resources and takes much time. The more photomasks are used to manufacture a microchip and the smaller the feature size, the more time and computing power are required [8]. Transition to a finer node leads to both an increase in the number of photomasks per chip and an increase in the computation time per layer [9].
The use of machine learning-based artificial intelligence methods [10] and neural networks [11] can help to reduce the consumption of time and computational resources in OPC and ILT. Some research teams work on combining the machine learning-based OPC approach with EDA technical solutions to speed up the computation of photomask sets [12]. The use of machine learning and neural networks in computational lithography is reviewed in [13]. Earlier we investigated the efficiency of using different machine learning (ML) algorithms in the OPC technique [14]. The present paper focuses on the use of artificial neural nets for computing a photomask corresponding to the one calculated by ILT.
The use of generative adversarial networks (GAN) for calculating photomasks is one possible solution. The approach can be used either by itself [15] or in combination with a conventional method [16], which speeds up the learning and increases the computation accuracy. Another solution involves specialized convolutional neural networks [17, 18], which are either modified standard CNNs [19] or newly developed CNNs [20].
In our research, we deal with a 90 nm chip fabrication process. The OPC techniques at our disposal correspond to specific lithography equipment parameters and resist properties, which restricts the use of other machine learning models and neural networks. For this reason, we have generated original data sets using Calibre Workbench. The objective of the research is to determine the efficiency of available CNNs in terms of computation time and accuracy. U-Net [21], ErfNet [22] and DeepLabV3+ [23] are used in the comparison.

2 THE DATA SETS, HARDWARE AND SOFTWARE USED

2.1 Data Sets

We used EDA Calibre to generate two data sets of random 90 nm-node topologies to train the networks. Each training sample consists of a pair: a topology (Fig. 1a) and an ILT-generated mask (Fig. 1b or 1c depending on the data set; the difference between them is explained below).
The first data set holds the original topology (Fig. 1a) and the results of its simulation without using the MRC (mask rule check) compliance option (Fig. 1b).
In the second data set the ILT algorithm with the MRC compliance option is used to compute the target mask (Fig. 1c). As the mask fabrication process has its own technological restrictions, it is necessary to use the MRC compliance option in ILT-based computations.
The mask corresponds to the most difficult computation case: the transistor gate correction. The patterns are reduced to binary black-and-white images: the white pixels correspond to the feature and the black ones to the background. The image resolution is 768 × 768 pixels. Each pixel corresponds to a 5 × 5 nm topology element. The critical dimension of the generated features is 90 nm. Each data set consists of 1536 pairs of images and is divided into three parts: 70% is training data, 10% is validation data for controlling the overfitting, and 20% is test data for model evaluation.
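As an illustration of the data preparation described above, the following minimal sketch (not the authors' code) shows how such topology/mask pairs could be wrapped into a PyTorch dataset and split 70/10/20; the directory layout and file naming are hypothetical assumptions.

```python
# Sketch: pairing binary topology images with their ILT target masks.
import glob
import numpy as np
import torch
from torch.utils.data import Dataset, random_split
from PIL import Image

class MaskPairDataset(Dataset):
    """Pairs of 768x768 binary images: input topology and ILT-generated mask."""
    def __init__(self, topo_files, mask_files):
        self.topo_files = topo_files
        self.mask_files = mask_files

    def __len__(self):
        return len(self.topo_files)

    def __getitem__(self, i):
        topo = np.array(Image.open(self.topo_files[i]).convert("1"), dtype=np.float32)
        mask = np.array(Image.open(self.mask_files[i]).convert("1"), dtype=np.float32)
        # add a channel dimension: tensors of shape (1, 768, 768)
        return torch.from_numpy(topo)[None], torch.from_numpy(mask)[None]

topo_files = sorted(glob.glob("data/topology/*.png"))   # hypothetical paths
mask_files = sorted(glob.glob("data/ilt_mask/*.png"))
ds = MaskPairDataset(topo_files, mask_files)

# 70% train / 10% validation / 20% test, as described above (1536 pairs per set)
n = len(ds)
n_train, n_val = int(0.7 * n), int(0.1 * n)
train_ds, val_ds, test_ds = random_split(
    ds, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(0))
```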

2.2 Hardware and Software

An Intel Gold 6254 CPU and Calibre EDA were used to compute the MRC-compliant masks. The total computation time was 30 min, which corresponds to roughly 1 s per pattern. A laptop-grade Nvidia RTX 3080 GPU and the MRC-compliant data set were used to train the U-Net, ErfNet and DeepLabV3+ networks for solving the inverse problem. The PyTorch framework [24] was used in both cases.

3 THE NEURAL NETWORKS USED IN COMPUTATIONS

3.1 Neural Networks Architectures

U-Net [21] was the first network used in the experiments. The network proved its worth in segmentation problems and image-to-image translations [25]. The network structure we use in the experiment is similar to that of U-Net (see Fig. 2). It consists of a convergence channel (encoder) at the left and a divergence channel (decoder) at the right.
The convergence channel (encoder), which is a typical structure of a convolutional neural network, performs successive 4 × 4-convolution operations with striding (2, 2) and padding (1, 1) followed by BatchNorm and ReLU. Unlike the original U-Net, our network has no maxpooling layers. These are replaced by size-reducing striding. Besides, unlike the original research, we use padding to make the output and input of our network have the same size and avoid the loss of information caused by cropping. We also apply BatchNorm, which was not used in the original work.
The decoder carries out consecutive steps that increase the feature map resolution and concatenate the result with the corresponding feature map from the encoder channel. Each operation is followed by BatchNorm and the ReLU activation function, except the last step, which is followed by a sigmoid activation function to keep the network output between 0 and 1. The total net size is 1 829 025 parameters.
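The following is a minimal sketch of such an encoder-decoder, assuming the layer pattern described above (4 × 4 strided convolutions, BatchNorm, ReLU, skip concatenations, sigmoid output); the depth and channel widths are illustrative and do not reproduce the exact 1 829 025-parameter configuration.

```python
# Sketch of a small U-Net-like encoder-decoder with strided convolutions.
import torch
import torch.nn as nn

def down(c_in, c_out):
    # 4x4 convolution with stride 2 halves the resolution (replaces maxpooling)
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

def up(c_in, c_out):
    # transposed 4x4 convolution with stride 2 doubles the resolution
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

class SmallUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.d1, self.d2, self.d3 = down(1, 16), down(16, 32), down(32, 64)
        self.u3, self.u2 = up(64, 32), up(64, 16)   # inputs include skip channels
        self.u1 = nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        e1 = self.d1(x)
        e2 = self.d2(e1)
        e3 = self.d3(e2)
        y = self.u3(e3)
        y = self.u2(torch.cat([y, e2], dim=1))
        # the last layer is followed by a sigmoid so the output stays in [0, 1]
        return torch.sigmoid(self.u1(torch.cat([y, e1], dim=1)))
```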
ErfNet is the second neural network we use in the experiment. Its structure largely follows the architecture proposed in [22]. The important feature of this net is a special unit of consecutive factorized convolutions (non-bottleneck-1D), which makes it possible to reduce the number of parameters in the encoder without reducing the number of convolution layers. The decoder used in the network has fewer upsampling layers, which permits an additional decrease in the number of parameters. Another advantage of the network over the other architectures is its balance of image processing time and accuracy. The total net size is 2 063 130 parameters.
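For reference, here is a minimal sketch of the non-bottleneck-1D residual unit written from the description in [22]; the exact ordering of normalization and dropout is our reading of that paper, not the authors' code.

```python
# Sketch of ErfNet's factorized residual unit ("non-bottleneck-1D").
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonBottleneck1D(nn.Module):
    def __init__(self, channels, dilation=1, dropout=0.0):
        super().__init__()
        # a 3x3 convolution is factorized into a 3x1 followed by a 1x3 convolution
        self.conv3x1_1 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_1 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn1 = nn.BatchNorm2d(channels)
        # the second factorized pair uses dilation to enlarge the receptive field
        self.conv3x1_2 = nn.Conv2d(channels, channels, (3, 1),
                                   padding=(dilation, 0), dilation=(dilation, 1))
        self.conv1x3_2 = nn.Conv2d(channels, channels, (1, 3),
                                   padding=(0, dilation), dilation=(1, dilation))
        self.bn2 = nn.BatchNorm2d(channels)
        self.drop = nn.Dropout2d(dropout)

    def forward(self, x):
        y = F.relu(self.conv3x1_1(x))
        y = F.relu(self.bn1(self.conv1x3_1(y)))
        y = F.relu(self.conv3x1_2(y))
        y = self.drop(self.bn2(self.conv1x3_2(y)))
        return F.relu(x + y)   # residual connection; the shape is preserved
```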
DeepLabV3+ is the third network structure we used in the experiment. The framework can use different types of constituent units; a comparative analysis of their efficiency is given in [23]. The advantage of the network is that atrous (dilated) convolution layers enlarge the receptive field of a neuron, and the concatenation of representations obtained with different receptive fields enables the neural net to learn both local and wide-area relations. Different net arrangements engage the following main units: a backbone unit, an ASPP (atrous spatial pyramid pooling) unit, a decoder and a few other units. In the experiment, we used a structure that consists of three modules: a backbone unit, an ASPP unit and a decoder unit. The backbone unit used a pre-trained ResNet-101 neural net. The total size of the DeepLabV3+ network was 59 332 642 parameters.
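As a rough reference point only: torchvision ships DeepLabV3 (without the "+" decoder), so the sketch below merely approximates the configuration described above with a ResNet-101 backbone and a single-channel sigmoid head; it is not the model used in the paper.

```python
# Sketch: a single-channel mask predictor built on torchvision's DeepLabV3.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet101

class DeepLabMask(nn.Module):
    def __init__(self):
        super().__init__()
        # num_classes=1: a single mask channel; the ResNet-101 backbone can
        # additionally be initialized from ImageNet depending on the
        # torchvision version in use
        self.net = deeplabv3_resnet101(num_classes=1)

    def forward(self, x):
        # the backbone expects 3 input channels, so the binary topology is repeated
        out = self.net(x.repeat(1, 3, 1, 1))["out"]
        return torch.sigmoid(out)

model = DeepLabMask()
pred = model(torch.zeros(1, 1, 768, 768))   # -> shape (1, 1, 768, 768)
```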

3.2 Loss and Metrics

Pixel-wise binary cross-entropy is used as the loss function in training. IoU and MSE (mean square error) are used as metrics. The cross-entropy is computed at every pixel as
$$BCE(p,t) = -t\log(p) - (1 - t)\log(1 - p),$$
(1)
where t is the actual binary value of a pixel, p is a number from 0 to 1 predicted by the neural net.
The IoU metric is used to evaluate the quality of the generated patterns:
$$\mathrm{IoU}(A,B) = \frac{|A \cap B|}{|A \cup B|}.$$
(2)
The IoU metric (also known as the Jaccard index, or Intersection over Union, see relation (2)) is a number between 0 and 1 that measures the similarity of the inner "volumes" of two objects. The IoU metric is computed between the output of the neural network and the target image (Fig. 3).
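A minimal sketch, assuming sigmoid network outputs and binary targets, of how the loss of Eq. (1) and the metric of Eq. (2) could be computed in PyTorch.

```python
# Sketch: pixel-wise BCE loss, Eq. (1), and IoU metric, Eq. (2).
import torch

bce = torch.nn.BCELoss()   # expects probabilities in [0, 1] and binary targets

def iou(p: torch.Tensor, t: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Intersection over Union (Jaccard index) on binarized predictions."""
    a = (p > threshold)
    b = (t > threshold)
    inter = (a & b).sum(dim=(-2, -1)).float()
    union = (a | b).sum(dim=(-2, -1)).float()
    return (inter / union.clamp(min=1)).mean()

# usage: p = model(x); loss = bce(p, t); quality = iou(p, t)
```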

3.3 Edge Placement Error (EPE)

EPE is the error of placement of a pattern edge at a control point [26] (the "Error calculation" step in Fig. 3; the error is marked in black). The EPE distribution characterizes the error of edge placement in a generated pattern. The error is computed at control points. Figure 3 shows a hypothetical ILT process. To compute the EPE, the output of the neural network is transformed into GDSII; after that, Calibre Workbench EDA is used to carry out the modelling. The computation involves 300 neural network-generated patterns. The EPE is calculated using only the first data set (without the MRC option), because the training results on the second data set were not good enough (Fig. 5).
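The EPE values reported here come from the Calibre flow just described. Purely for illustration, and under the simplifying assumption that both the simulated and the target patterns are available as binary images on the same 5 nm grid, an unsigned EPE-like histogram could be approximated with a distance transform as sketched below; control-point selection is not modeled.

```python
# Rough illustration only: approximate edge-placement distances from binary images.
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def edge_pixels(mask: np.ndarray) -> np.ndarray:
    """Boundary pixels of a binary pattern: the pattern minus its erosion."""
    mask = mask.astype(bool)
    return mask & ~binary_erosion(mask)

def epe_samples(simulated: np.ndarray, target: np.ndarray, nm_per_pixel=5.0):
    # distance (in nm) from every pixel to the nearest target edge pixel
    dist_to_target_edge = distance_transform_edt(~edge_pixels(target)) * nm_per_pixel
    # sample that distance at the edge pixels of the simulated pattern
    return dist_to_target_edge[edge_pixels(simulated)]

# usage: np.histogram(epe_samples(sim, tgt), bins=20) gives an EPE-like histogram
```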

4 THE RESULTS

The models were typically trained for 30 epochs, until overfitting set in; each epoch took about 2 min. The results averaged over ten training processes are given in Table 1.
Table 1. The results averaged over 10 realizations

Parameters / Network architecture    U-Net     ErfNet    DeepLabV3+
Dataset without MRC, IoU             0.9894    –         –
Dataset with MRC, IoU                0.921     0.923     0.924
Computation time per picture, ms     7         13        27
It is seen from Table 1 that all the networks demonstrate similar quality indexes (they coincide to two decimal places). Given the large number of runs, we can conclude that the neural networks have the same computation accuracy. It should be noted that the U-Net structure demonstrates the shortest computation time: about half the ErfNet time and a quarter of the DeepLabV3+ time. This allows us to conclude that the U-Net structure is the fastest framework.
For comparison, the computation of a single element using conventional methods with Calibre EDA takes about 1 s.
As can be seen from Table 1, on the data set without MRC the IoU reaches 0.9894, i.e., the conventional ILT method and the neural net give almost the same result (the difference is about 1%). In this case the accuracy of the correction becomes the decisive factor, and the EPE metric is used to measure it.
The EPE-based evaluation of our model is shown in Fig. 4. The EPE histograms of the neural network and the original ILT model are used as a quality measure.
It is seen from Fig. 4 that the EPE distribution of the neural network is closer to zero. It means the neural network gives better results than the original ILT algorithm. Besides, the ILT algorithm-based computation of one pattern takes about a second, while the neural network takes 7 ms, i.e., the network works two orders of magnitude faster.
Unfortunately, we cannot achieve the same results on the data set with MRC. For this data set the IoU metric is only about 0.92 (see Table 1). Moreover, as seen in Fig. 5, the neural net-generated outputs have noticeably rounded corners. This effect is typical of all three networks and can be regarded as their main drawback.

5 CONCLUSIONS

Like other inverse problems, inverse lithography is a class of problems where the use of neural networks looks most natural. A solution generated by a neural network can easily be tested by direct numerical modeling. Unlike the iterative process of solving the inverse problem, this sort of modeling does not take much time.
On the first data set (non-MRC ILT) the neural network proved to work faster and give more accurate results than the standard algorithms employed in the industry. On a validation sample of 300 pattern pairs the U-Net-based neural network showed better results than the original inverse lithography model, while the single-pattern computation time decreased by about a factor of 100. Thus, we conclude that the approach offers an acceptable level of efficiency.
On the second data set (MRC-assisted ILT) the networks demonstrate insufficient accuracy (about 92% in terms of the IoU metric). In this case the main drawback is the presence of blurred edges in the output patterns; the blurring is especially pronounced near pattern corners. The approach needs further investigation to cope with this drawback. Since the extensive route of increasing the number of model layers (ErfNet) and the total number of parameters and layers (DeepLabV3+) proved ineffective, we regard the use of new learning methods and other convolution layers as the next steps towards good results.

CONFLICT OF INTEREST

The authors of this work declare that they have no conflicts of interest.
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

References
1. Chien, P. and Chen, M., Proximity effects in submicron optical lithography, Optical Microlithography VI, International Society for Optics and Photonics, 1987, vol. 772, pp. 35–41.
2. Balasinski, A., Gangala, H., Axelrad, V., and Boksha, V., A novel approach to simulate the effect of optical proximity on MOSFET parametric yield, in International Electron Devices Meeting 1999, Technical Digest, IEEE, no. 99CH36318, pp. 913–916.
3. Wong, A.K.K., Resolution Enhancement Techniques in Optical Lithography, SPIE Press, 2001, vol. 47.
4. Balan, N.N. et al., Basic approaches to photoresist mask generation models in computational lithography, High. Schools Bull., Proc. Electron. Eng., 2020, vol. 22, no. 4, pp. 279–289.
5. Otto, O.W., Garofalo, J.G., Low, K.K., et al., Automated optical proximity correction: a rules-based approach, Optical/Laser Microlithography VII, International Society for Optics and Photonics, 1994, vol. 2197, pp. 278–293.
6. Li, J. et al., Model-based optical proximity correction including effects of photoresist processes, Optical Microlithography X, International Society for Optics and Photonics, 1997, vol. 3051, pp. 643–651.
7. Hung, C.Y., Zhang, B., Tang, D., Guo, E., Pang, L., Liu, Y., and Wang, K., First 65 nm tape-out using inverse lithography technology (ILT), in 25th Annual BACUS Symposium on Photomask Technology, SPIE, 2005, vol. 5992, pp. 596–604.
8. Krasnikov, G.Ya. and Sinyukov, D.V., Advanced optical proximity correction methods, problems and roadmaps, in Primary Problems of the Component Base and Materials for IT and Control Systems, Proceedings of the RAS Science Board, 2019, vol. 1, no. 3, p. 17.
9. Spence, C. and Goad, S., Computational requirements for OPC, Design for Manufacturability through Design-Process Integration III, International Society for Optics and Photonics, 2009, vol. 7275, p. 72750U.
10. Choi, S., Shim, S., and Shin, Y., Machine learning (ML)-guided OPC using basis functions of polar Fourier transform, Optical Microlithography XXIX, International Society for Optics and Photonics, 2016, vol. 9780, p. 97800H.
11. Choi, S., Shim, S., and Shin, Y., Neural network classifier-based OPC with imbalanced training data, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2018, vol. 38, no. 5, pp. 938–948.
12. Shi, B. et al., Fast OPC repair flow based on machine learning, Design-Process-Technology Co-Optimization for Manufacturability XIV, International Society for Optics and Photonics, 2020, vol. 11328, p. 113281B.
13. Shin, Y., Computational lithography using machine learning models, IPSJ Trans. Syst. LSI Des. Method., 2021, vol. 14, pp. 2–10.
14. Tryasoguzov, P.E., Kuzovkov, A.V., Karandashev, I.M., et al., Using machine learning methods to predict the magnitude and the direction of mask fragments displacement in optical proximity correction (OPC), Opt. Mem. Neural Networks, 2021, vol. 30, pp. 291–297.
15. Ye, W. et al., LithoGAN: End-to-end lithography modeling with generative adversarial networks, 2019 56th ACM/IEEE Design Automation Conference (DAC), IEEE, 2019, pp. 1–6.
16. Yang, H. et al., GAN-OPC: Mask optimization with lithography-guided generative adversarial nets, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2019, vol. 39, no. 10, pp. 2822–2834.
17. Sun, X. et al., U-Net convolutional neural network-based modification method for precise fabrication of three-dimensional microstructures using laser direct writing lithography, Opt. Express, 2021, vol. 29, no. 4, pp. 6236–6247.
18. Shao, H.C. et al., From IC layout to die photograph: A CNN-based data-driven approach, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2020, vol. 40, no. 5, pp. 957–970.
19. Ma, X. et al., Model-driven convolution neural network for inverse lithography, Opt. Express, 2018, vol. 26, no. 25, pp. 32565–32584.
20. Ma, X., Zheng, X., and Arce, G.R., Fast inverse lithography based on dual-channel model-driven deep learning, Opt. Express, 2020, vol. 28, no. 14, pp. 20404–20421.
21. Ronneberger, O., Fischer, P., and Brox, T., U-Net: Convolutional networks for biomedical image segmentation, Int. Conf. on Medical Image Computing and Computer-Assisted Intervention, Cham: Springer, 2015, pp. 234–241.
22. Romera, E. et al., ErfNet: Efficient residual factorized ConvNet for real-time semantic segmentation, IEEE Trans. Intell. Transp. Syst., 2017, vol. 19, no. 1, pp. 263–272.
23. Chen, L.C. et al., Encoder-decoder with atrous separable convolution for semantic image segmentation, Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 801–818.
24. PyTorch Documentation. https://pytorch.org/docs/stable/index.html.
25. Zhu, J.-Y., Park, T., Isola, P., and Efros, A.A., Unpaired image-to-image translation using cycle-consistent adversarial networks, IEEE International Conference on Computer Vision (ICCV), 2017.
Metadata
Title: Investigating the Efficiency of Using U-Net, Erf-Net and DeepLabV3 Architectures in Inverse Lithography-based 90-nm Photomask Generation
Authors: I. M. Karandashev, G. S. Teplov, A. A. Karmanov, V. V. Keremet, A. V. Kuzovkov
Publication date: 01.12.2023
Publisher: Pleiades Publishing
Published in: Optical Memory and Neural Networks / Issue 4/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI: https://doi.org/10.3103/S1060992X23040094
