
IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 2, NO. 2, MAY 1989



A Machine-Learning Classification Approach for IC Manufacturing Control Based on Test Structure Measurements


Abstract: This paper describes the use of a machine-learning method for classifying electrical measurement results from a custom-designed test chip. These techniques are used for characterizing the performance of a 1-µm integrated circuit lithography process. The focus of the work is to develop a method for producing reliable classification rules from databases containing large samples of measurement data. The paper describes a test chip, data-handling methods, rule generation techniques, and statistical data reduction and parameter extraction techniques. An analysis of error introduced by noise in the rule formation process is presented.

Manuscript received August 19, 1988; revised October 3, 1988. The authors are with the National Institute of Standards and Technology, Gaithersburg, MD 20899. IEEE Log Number 8926444. U.S. Government work not protected by U.S. copyright.

INTRODUCTION

Test chips are used as a means of obtaining parametric electrical data which can then be used to evaluate the performance of an integrated circuit manufacturing process. The test chips contain test structures which provide specific electrical parameters. Coupled with a computer-controlled parametric test system, they can provide the user with information that cannot otherwise be obtained because of the nature of the measurement, the measurement time, or the cost associated with other methods of performing the measurement.

Comprehensive electrical testing of complex test chips often results in large quantities of data which contain detailed information about a process. Very often, the large volume of these results places them beyond the ability of the user to readily or effectively interpret. When accurate conclusions cannot be made in a relatively short time period, the data from these tests are ignored and potential information provided by the testing is lost.

Induction-based machine-learning algorithms are used for classifying large quantities of measured results or examples and for identifying relationships that can be expressed in the form of rules. The algorithm used in this work is the Iterative Dichotomiser version 3, or ID3, originally developed by Quinlan [1]. The resultant classification rules can be implemented in an expert system shell. This combination can provide a means of training and customizing a diagnostic system to be responsive to process variations experienced in a semiconductor manufacturing environment.

This paper describes a means of developing rules for expert systems based on electrical measurements from custom-designed test chips. It is initially limited to evaluating the performance of a 1-µm lithography process in order to evaluate the procedure developed and to limit the bounds of the work.

MACHINE-LEARNING METHOD

Machine learning is the process of computer-based knowledge acquisition with the ability to apply that information in an effective manner to the solution of a problem. Induction is the process of using a known set of samples to generate a set of relationships, expressed as rules, that can explain these samples as well as others. These rules are derived through analysis of the information contained in a set of samples, or measured data set, which is referred to as the training set. Each sample can be described by values of a finite set of parameters known as attributes. Every sample in the training set has a unique attribute known as the class, which assigns the sample to a particular category [2]. For example, a given training set may have attributes that consist of the average measured linewidth of certain lines on a wafer. The class assigned to the wafer can be the condition in which the wafer was processed, e.g., thin photoresist.

The basic concept of the induction algorithm is to produce a set of relationships or rules which are ordered in terms of the maximum information present in the given data. The ID3 algorithm is used to recursively partition the training set into subsets until no subset contains elements from two or more different classes. This results in a classification structure known as a decision tree, which represents the essential information in the training set. An example is seen in Fig. 1. The terminal nodes of the tree contain the class values.

Fig. 1. Decision tree. (Figure not reproduced; node labels: Attribute 1; Condition 1, Condition 2.)

When the attributes are numeric, a binary tree (a tree that has exactly two branches for each node) is produced, with branching for specific attribute values. Those examples for which the value of the attribute is less than a determined splitting threshold value follow one branch, while all other examples follow a second branch. The determination of the splitting value of an attribute, using the ID3 algorithm, is based on the entropy calculated by [1]:

$$H = -\sum_i P_i \log P_i \tag{1}$$

where $H$ is the average expected information or entropy of the training set and $P_i$ is the probability of an example in the training set being in class $i$. The summation index $i$ ranges over the classes represented in the training set. The selection of a splitting attribute $A$ and its corresponding value in the decision tree is based on the maximum information obtained from this attribute.

Let $H(C)$ represent the expected information content or entropy of a set $C$ of examples, with the entropy of the empty set being zero. The possible choice of attribute $A$ is based on maximum information gained from this attribute. If $A$ is discrete, then the expected information content is

$$B(C, A) = \sum_j P_{A_j} \, H(C_j), \qquad P_{A_j} = \text{probability that the value of } A \text{ is } A_j \tag{2}$$

where $j$ ranges over the possible values of $A$ and $H(C_j)$ is the expected information content or entropy from all examples in the set $C$ with attribute $A = A_j$. The best attribute to choose is one that results in the maximum information gain, $G(A)$, of

$$G(A) = H(C) - B(C, A). \tag{3}$$

The splitting process continues until all terminal nodes in the decision tree are homogeneous, i.e., until they are identified by one class per node. The decision tree is thus ordered in such a manner as to use the maximum information in the attributes. This process results in the generation of an efficient decision tree structure but may result in trees not easily understood by human experts [3].

The ID3 algorithm can handle a large quantity of data, where the computation time increases linearly with the number of examples given, the number of attributes used, and the complexity of the concept to be developed. A detailed description of this process can be found elsewhere [1].

TEST CHIP DESIGN AND FABRICATION

A test chip was designed containing polysilicon cross-bridge resistor test structures [4] with design linewidths of 0.5 to 4.0 µm. These structures are used to measure the linewidth and sheet resistance of a conductive layer. A layout of the test chip is seen in Fig. 2, with the areas marked L containing arrays of cross-bridge resistor test structures. Each test chip spans a 10 by 10 mm exposure field, which corresponds to the largest image field available for the equipment. The placement of the test structures is such that the linewidth variation along the diagonals of the test chips and the wafer could be measured. This layout spans the largest image field possible.

Test chips were fabricated by patterning an approximately 0.5-µm layer of LPCVD polysilicon on a 0.15-µm film of silicon dioxide grown on silicon wafers. The polysilicon layer was subsequently doped with phosphorus and annealed. A wafer stepper exposure system with a 0.28 numerical aperture lens and a reactive ion etching system were used to expose and delineate arrays of test structures on silicon wafers. Operators were instructed to perform all process steps in strict accordance with written process specifications.

One wafer lot, totaling 26 wafers, was fabricated with a series of known exposure and etching conditions. Each wafer contained 37 identical test chips having 90 cross-bridge test structures per chip. Groups of wafers in this lot were made with intentional baseline, overexposed, underexposed, overfocused (focus below wafer surface), and underfocused (focus above wafer surface) process conditions as well as selected combinations of these conditions. These process conditions were selected by experienced operators to be representative of typical variations encountered for the process and equipment used in this experiment.

Testing was performed using a computer-controlled parametric test system with a digital current source with microampere resolution and a digital voltmeter with microvolt resolution. Measurements on similar samples have shown the electrical linewidth measurements to agree with optical measurements to within the respective uncertainties of both measurements [5].

LINEWIDTH MODEL AND PARAMETER EXTRACTION

Measurement results from the previous tests are represented by a model accounting for the linewidth variations. A weighted least-squares fitting technique is used to extract K + 11 model parameters which represent the intrachip linewidth variation (associated with systematic linewidth variations due to the wafer stepper), the wafer linewidth variation (associated with wafer level lithography variations such as overetching or photoresist uniformity), and the average linewidth (where K is the number of lines per site). Statistical estimates of the significance of these fitted parameters as well as tests to measure model adequacy and tester replication were also made.
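
As an illustration of the induction step described under MACHINE-LEARNING METHOD, the following minimal Python sketch evaluates Eqs. (1) to (3) for a single numeric attribute and picks the splitting threshold with the largest information gain. It is not the authors' implementation. The function names, the use of base-2 logarithms, and the choice of candidate thresholds at midpoints between adjacent distinct attribute values are all assumptions made here for illustration; the paper does not specify them.

```python
import math
from collections import Counter


def entropy(labels):
    """Eq. (1): H = -sum_i P_i log P_i, with H(empty set) = 0.

    Base-2 logarithm is an assumption; the paper does not state the base.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_for_threshold(values, labels, threshold):
    """Eq. (3): G(A) = H(C) - B(C, A) for the binary split value < threshold."""
    below = [c for v, c in zip(values, labels) if v < threshold]
    above = [c for v, c in zip(values, labels) if v >= threshold]
    n = len(labels)
    # Eq. (2): B(C, A) = sum_j P_Aj * H(C_j), summed over the two branches.
    b = (len(below) / n) * entropy(below) + (len(above) / n) * entropy(above)
    return entropy(labels) - b


def best_threshold(values, labels):
    """Return (threshold, gain) maximizing the gain for one numeric attribute.

    Candidate thresholds are midpoints between adjacent distinct values
    (an assumption). Returns (None, 0.0) when no split is possible.
    """
    distinct = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    if not candidates:
        return None, 0.0
    scored = ((t, gain_for_threshold(values, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical average linewidths (in micrometers) and process classes,
    # invented for this example only.
    linewidths = [0.92, 0.95, 0.98, 1.05, 1.08, 1.10]
    classes = ["overexposed"] * 3 + ["baseline"] * 3
    threshold, gain = best_threshold(linewidths, classes)
    print(f"split at {threshold:.3f} with gain {gain:.3f}")
```

A full ID3 tree would apply `best_threshold` to every attribute, split on the attribute with the largest gain, and recurse on each branch until every terminal node holds a single class.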