
Angle Resolved x-ray Photoelectron Spectroscopy, ARXPS – Experience in the Wafer Processing Industry so far


- C.R. Brundle, C.R. Brundle & Associates, Soquel, CA
- G. Conti, Y. Uritsky – DTCL, Applied Materials, Santa Clara, CA
- J. Wolstenholme, Thermo Inc.

- Our practical experience using ARXPS for determining the following:
- 1. Thickness for nominally single overlayer films (0-40Å) – Characterization and Metrology
- 2. Composition depth distribution (0-40Å) – Characterization
- Note: “Dose” is a sub-set of composition (Metrology?)
- Taken as a given that XPS is a powerful technique for elemental and chemical state identification for 0-40Å films.

Acknowledgements: Charles Wang and Ghazal Peydaye-Saheli at Applied Materials

ARXPS – Experience in the Wafer Processing Industry

- Will use our experience over a 3-4 year period with 10-30Å Si/O/N gate oxide material, as produced in development by Applied Materials wafer processing tools and processes for Semiconductor Industry customers.
- Will refer to a few other necessary “illustrative examples” along the way.

What does this industry want?

- THICKNESS
- High Precision (better than 1% at 1σ repeatability / reproducibility for a 10Å film).
- For Metrology, fast (seconds per point), 5/9 point maps on 300mm wafers.
- Accuracy is of less concern; for metrology it is of no concern. The measurement will be calibrated anyway, using a λeff (“effective attenuation length”).
- Would like to be able to distinguish “apparent thickness variations” from what are really materials changes.
- → λeff changing with material change
- → λeff changing with t

What does this industry want?

- DOSE (e.g. N in Si/O/N; As in Si(100))
- 1% precision at 10Å for 1×10¹⁵ atoms/cm².
- Accuracy is again of less concern, BUT need to distinguish “apparent dose changes” from depth distribution changes.
- DEPTH DISTRIBUTION
- A crude distribution is OK (layer model approach?).
- BUT it needs to be reproducible and correct.
- Would like to be able to detect small variations in a given distribution (e.g. wafer to wafer or point to point on a wafer).

Other Issues

- For the Si/O/N work described here we assume flat (low roughness), laterally homogeneous (over analysis area) films. We know this to be true.
- For Hf based high k work the above is not always true.
- All the work is done using the Thermo Inc Theta 300 tool.
- All the “recipe development” for converting ARXPS data to depth profiling is done by P. Mack at Thermo Inc. We are merely users, though we do have the freedom to vary some parameters.
- We often have to make correlations with data from the ReVera tool (Gate group at Applied Materials), which is a single angle only tool designed specifically for metrology (t, N dose) in Si/O/N.

Theta Probe – Parallel ARXPS (PARXPS)

Theta Probe avoids the disadvantages by collecting all angles in parallel.

The Theta Probe ARXPS Solution

Two Dimensional Detector

Measures Energy and Angle Simultaneously

- Angular Range
- - 20° to 80°
- Parallel collection
- - Up to 96 channels in angle
- - Generally, 16 angles are used giving an angular resolution of 3.75°
- - Up to 112 channels in energy
- Parallel collection allows rapid ‘snapshot acquisition’
- - Excellent for ARXPS maps
- - Thickness maps
- - Dose maps

- Intensity as a function of depth
- - 65% of signal from < λ
- - 85% from < 2λ
- - 95% from < 3λ
- Information depth greater than thickness of gate dielectric

λ = Inelastic Mean Free Path (0.4 - 4 nm)
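The percentages quoted above follow from exponential attenuation of the photoelectron signal with depth. A minimal sketch, assuming a homogeneous material and emission along the surface normal (the function name is illustrative and not from the slides); the exact values differ slightly from the rounded figures on the slide:

```python
import math

def signal_fraction(depth_in_lambda: float) -> float:
    """Fraction of the total signal originating from depths shallower than
    depth_in_lambda (expressed in units of lambda), at normal emission."""
    return 1.0 - math.exp(-depth_in_lambda)

# Fractions for 1, 2 and 3 inelastic mean free paths
fractions = {n: signal_fraction(n) for n in (1, 2, 3)}
for n, f in fractions.items():
    print(f"< {n} lambda: {100 * f:.1f}% of signal")
```

At an emission angle θ away from the normal, the same expression applies with λ replaced by λ cos θ.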

[Figure: XPS survey spectrum, Counts / s vs. Binding Energy (0-1100 eV), with O1s, Si2p, C1s and N1s peaks labelled; narrow-scan spectra of the Si2p (Si4+ component marked), N1s, C1s and O1s regions, Counts / s vs. Binding Energy (eV).]

- Information depth varies with collection angle
- - I = I∞ exp(-d/(λ cos θ))
- Spectra from thin films on substrates are affected by the collection angle
- XPS as a function of the angle, θ (w.r.t. the surface normal), at which the photoelectrons leave the surface
- A set of measurements over a range of θ provides composition information over a range of depths.
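To make the angle dependence concrete, the following sketch evaluates I/I∞ = exp(-d/(λ cos θ)) for a substrate signal passing through an overlayer. The thickness (15 Å) and attenuation length (30 Å) are illustrative placeholders, not values from the slides.

```python
import math

def substrate_attenuation(d: float, lam: float, theta_deg: float) -> float:
    """I/I_inf for a substrate signal attenuated by an overlayer of
    thickness d, at emission angle theta measured from the surface normal."""
    return math.exp(-d / (lam * math.cos(math.radians(theta_deg))))

d, lam = 15.0, 30.0  # Angstrom; assumed example values
ratios = {theta: substrate_attenuation(d, lam, theta) for theta in (20, 50, 80)}
for theta, r in ratios.items():
    print(f"theta = {theta:2d} deg: I/I_inf = {r:.3f}")
```

The substrate signal falls steadily as θ moves towards grazing emission, which is why the high-angle data are the more surface sensitive.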

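Thickness Determination

- Based on the classical approach of determining the ratio of overlayer / substrate XPS intensities and using the Beer-Lambert equation and values for λ.
- For Si/O/N on Si(100) the overlayer signal is Si4+ and the substrate is Si0.

$$\frac{I_{Si^{4+}}}{I_{Si^{0}}} = \frac{I^{\infty}_{Si^{4+}}}{I^{\infty}_{Si^{0}}} \cdot \frac{1 - \exp\left[-d/(\lambda_{Si^{4+},SiO_2}\cos\theta)\right]}{\exp\left[-d/(\lambda_{Si^{0},SiO_2}\cos\theta)\right]} \qquad (1)$$

$$R = R^{\infty}\,\frac{1 - \exp\left[-d/(\lambda_{Si^{4+},SiO_2}\cos\theta)\right]}{\exp\left[-d/(\lambda_{Si^{0},SiO_2}\cos\theta)\right]} \qquad (2)$$

$\lambda_{Si^{4+},SiO_2} \approx \lambda_{Si^{0},SiO_2}$ (KE’s are nearly the same), so this reduces to:

$$\ln\left[1 + R/R^{\infty}\right] = \frac{d}{\lambda_{Si,SiO_2}\cos\theta} \qquad (3)$$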
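The reduced relation ln[1 + R/R∞] = d/(λ cos θ) can be inverted to give a thickness from a single measured ratio. A minimal sketch under that single-λ assumption; the numerical values of R∞ and λ below are placeholders chosen only for illustration, and the function names are not from the slides.

```python
import math

def intensity_ratio(d: float, r_inf: float, lam: float, theta_deg: float) -> float:
    """Overlayer/substrate ratio R predicted by the single-overlayer model
    when both signals share the same attenuation length lam."""
    x = d / (lam * math.cos(math.radians(theta_deg)))
    return r_inf * (1.0 - math.exp(-x)) / math.exp(-x)

def overlayer_thickness(r: float, r_inf: float, lam: float, theta_deg: float) -> float:
    """Invert ln[1 + R/R_inf] = d / (lam * cos(theta)) for d."""
    return lam * math.cos(math.radians(theta_deg)) * math.log(1.0 + r / r_inf)

# Round trip with assumed example values (Angstrom, degrees)
r = intensity_ratio(d=15.0, r_inf=0.9, lam=30.0, theta_deg=45.0)
d_recovered = overlayer_thickness(r, r_inf=0.9, lam=30.0, theta_deg=45.0)
print(f"R = {r:.3f}, recovered d = {d_recovered:.2f} A")
```

A single-angle value like this carries no check on whether the model is valid, which is the motivation for fitting over a range of angles.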

Thickness Determination

- Many sources for λSi,Si (and λ values in general – see C. Powell publications).
- Classical approach ignores elastic scattering, λe. We know (Powell et al.) this can cause significant errors, so that a λeff should be used, and that the errors vary with thickness, so that λeff becomes a function of t.
- The effects of elastic scattering get greater at higher θ (more grazing angle), over-representing the substrate. This leads to a low estimate of t if a fit is made to equation 3 that includes data at high θ (see later).
- Our values of λ come from the Thermo Inc algorithm. They are calculated on the basis of formula, density, band gap, and KE.

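$$R^{\infty} = \frac{I^{\infty}_{SiO_2}}{I^{\infty}_{Si}} = \frac{\sigma_{Si,SiO_2}\,\lambda_{Si,SiO_2}}{\sigma_{Si,Si}\,\lambda_{Si,Si}} = \frac{D_{SiO_2}\,F_{Si}}{D_{Si}\,F_{SiO_2}} \times \frac{\lambda_{Si,SiO_2}}{\lambda_{Si,Si}}$$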

Thickness Measurement : Testing Model Validity

- Silicon Dioxide on Silicon
- Plot: ln[1 + R/R∞] vs. 1/cos θ
- Fitting: fit through the origin
- Gradient = d/λ
- Nominal thickness values from ellipsometry (curve labels): 9.0, 6.4, 4.3, 3.6, 2.3 and 1.9 nm
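The validity test described on this slide (plot ln[1 + R/R∞] against 1/cos θ, fit a line through the origin, read d/λ from the gradient) can be sketched as follows. The data here are synthetic, generated from the same model with assumed values for λ, R∞ and d, so the fit is exact by construction; with real data the residuals show where the model breaks down.

```python
import math

def slope_through_origin(xs, ys):
    """Least-squares gradient of y = m * x (line constrained through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Assumed illustrative values: Angstrom for lam and d_true, dimensionless r_inf
lam, r_inf, d_true = 30.0, 0.9, 20.0
angles_deg = [23, 30, 38, 45, 53, 60]

xs = [1.0 / math.cos(math.radians(a)) for a in angles_deg]
# Synthetic "measured" ratios from the single-overlayer model
ratios = [r_inf * (math.exp(d_true * x / lam) - 1.0) for x in xs]
ys = [math.log(1.0 + r / r_inf) for r in ratios]

gradient = slope_through_origin(xs, ys)   # equals d / lam
d_fit = gradient * lam
residuals = [y - gradient * x for x, y in zip(xs, ys)]
print(f"gradient = {gradient:.4f}, d = {d_fit:.2f} A")
```

Systematic curvature in the residuals at the highest angles would be the signature of the elastic scattering effects discussed earlier.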

Comparison of XPS Results To Ellipsometry

- SiO2 on Si
- Excellent linearity
- Ellipsometry included C layer in thickness
- The offset will change as a function of time as more contamination is picked up


Thickness

- Data considered for Si/O/N on Si(100)
- 1) 8 sample set with t ~ 10-30Å, N%age ~ 7-30%
- - 4 from process A; 4 from process B
- - Determine d, N dose, and Max. Ent. derived depth profile
- - Only one set of experimental data, but evolving treatment over a 3 year period.
- Note: very large t and N%age range – not typical for metrology

Thickness:

- Quality of Data?
- Manual Fits- Operator Influence?
- - Repeatability by single operator?
- Effect of changing composition (N% age), which is large here?
- Effects of angular range used?
- - Depends on thickness, material
- - Consequence for single angle determination?
- Effect of composition variation with depth?
- → Automated 3-layer model (P. Mack, Thermo)
- - No operator dependency
- - Completely reproducible
- - Iterative fit to 3-layer depth distribution model and t (i.e. value of N dose and its distribution effect, t)

Single Overlayer Model for Film Thickness:

quality of data?; manual fits?

A-11

There is ambiguity in assigning intensity between the Si4+ and Si0 peaks.

XPS Measurements of SiO2 Thickness:

Effect of angular range included?

- Comparison of ARXPS with fixed angle XPS
- Good agreement except at large thickness
- Single angle measurement samples large angular range.

- ARXPS measurements
- Effect of angular range upon measured thickness
- Minimum angle is 23° in all cases
- Highest usable maximum angle depends upon oxide thickness

J. Wolstenholme, Thermo, Inc.

Thickness

- 8 SAMPLE SET of Si/O/N – One set of experimental data, but how it has been processed has changed from 2003 to 2007. Note: very large t and N%age range.

Single Overlayer Mod. for Film Thickness, slot 11

Thickness

Conclusions

- Precision of data is no problem
- Validity of model should be tested (i.e. use angular data and fit to the equation, not just a single angle determination)
- For Si4+ (overlayer) / Si0 (substrate) fit to data, operator dependence for manual fit can be a problem
- Automated fit (3 layer model) can be completely reproducible
- Relative accuracy depends on validity of parameters input – λ (f(t)?), density (f(N%age)), depth distribution (f(N%age)?)
- (e.g. 14.1Å for an 8.5% N film going to 20.1Å for a 23.7% N film, found using the manual non-iterative model, is a very different %age change compared to 13.8Å going to 16.8Å in the 3 layer model)
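The size of that discrepancy is easy to quantify from the four thickness values quoted in the last bullet. A small worked calculation (the helper name is illustrative):

```python
def percent_change(start: float, end: float) -> float:
    """Relative change from start to end, in percent."""
    return 100.0 * (end - start) / start

# Thickness values in Angstrom, as quoted on the slide
manual_model = percent_change(14.1, 20.1)       # manual non-iterative model
three_layer_model = percent_change(13.8, 16.8)  # automated 3 layer model

print(f"manual model:  {manual_model:.1f}% increase")
print(f"3 layer model: {three_layer_model:.1f}% increase")
```

The manual model implies roughly twice the relative thickness increase that the 3 layer model does for the same pair of films.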

N Dose

- So far we have only been listing “N%age”; i.e. the usual XPS approach of peak intensities corrected for photoionization cross-section. This assumes homogeneous composition.
- Dose is the total amount of N in the film.
- If uniform distribution, Dose = N%·t·C
- If non-uniform, N%·t·C becomes an “Apparent Dose”
- - The “Apparent Dose” can be greater or less than true dose, depending on depth distribution
- - ReVera single angle approach?
- ∙ Initially – assumed a depth distribution???
- ∙ Now – determines a depth distribution from a Tougaard background approach.
- Theta 300/Thermo: N dose by integrating N depth profile distribution from (a) full Max Ent approach or (b) 3-layer model (automated).

Effect of Distribution on Dose Calculation

So, we need to know N distribution to get true N dose.

[Figure: two schematic profiles of N concentration (CN) vs. depth (d), one for which True Dose < Apparent Dose and one for which True Dose > Apparent Dose.]
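The difference between true and apparent dose can be illustrated numerically. This is a sketch under simplifying assumptions that are not from the slides: the measured N% is taken as the attenuation-weighted average of the N concentration over the film, with a single λ for all signals, and C is set to 1 so doses are in arbitrary units. All names and numbers are illustrative.

```python
import math

def true_and_apparent_dose(profile, t, lam, theta_deg=0.0, C=1.0, n=2000):
    """Return (true_dose, apparent_dose) for a concentration profile c(z).

    true dose     = C * integral of c(z) dz over the film
    apparent dose = C * (XPS-weighted mean concentration) * t
    """
    path = lam * math.cos(math.radians(theta_deg))
    dz = t / n
    weighted = weight = integral = 0.0
    for k in range(n):
        z = (k + 0.5) * dz          # midpoint rule
        w = math.exp(-z / path)     # Beer-Lambert depth weighting
        c = profile(z)
        weighted += c * w * dz
        weight += w * dz
        integral += c * dz
    return C * integral, C * (weighted / weight) * t

t, lam = 15.0, 30.0  # Angstrom; assumed example values
uniform = lambda z: 0.10
near_surface = lambda z: 0.20 if z < t / 2 else 0.0
near_interface = lambda z: 0.20 if z >= t / 2 else 0.0

results = {name: true_and_apparent_dose(p, t, lam)
           for name, p in [("uniform", uniform),
                           ("near surface", near_surface),
                           ("near interface", near_interface)]}
for name, (true_d, app_d) in results.items():
    print(f"{name:15s} true = {true_d:.3f}  apparent = {app_d:.3f}")
```

All three profiles hold the same amount of N, yet in this model only the uniform one gives an apparent dose equal to the true dose.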

N Dose (×10¹⁵ atoms/cm²): 8 sample set

(from integrating N depth distribution; discussed later)

N%·t vs. N Dose

N Dose from 3-layer model, Jan 2007, versus N Dose, June 2003.

N Dose

Conclusions

- NEED CALIBRATION/VERIFICATION BY MEIS!
- Striking agreement between 3-layer model and the June 2003 Max Ent results, except for very high N content (even though large differences in estimated t!).
- June 2003 – About 8% spread from pure N%·t approach.
- 3-layer – About 15% spread from pure N%·t approach, but linear
- Limiting angular range (66°-55°) produces up to 10% variation (because Max Ent derived depth profile is different).
- Note: very large dose variations are being considered here. Not usual for metrology.

Ultra-Thin Film Depth Profiling by ARXPS Status

- Because of the short mean free path lengths, λ, of the photoelectrons generated and used in XPS, non-destructive depth profiling is limited in the depth it can effectively reach.
- 65% from < 1 λ; 85% from < 2 λ; 95% from < 3 λ
- λ ranges from 0.5nm to 4nm (material and electron energy dependent)

- How limited depends on level of detail wanted
- ARXPS quite capable of detecting a substrate > 3 λ down, but not profiling the 3 λ overlayer or giving a precise thickness
- Detailed profiling possible up to ~ 2 λ thickness
- Reliability of profile obtained by ARXPS?
- Relative Depth Plot, RDP - QUALITATIVE but simple, fast, model independent
- Maximum Entropy Method - QUANTITATIVE, but modelled and requires experience or a ‘recipe’

Processing the data – RDP

Construction:

- Collect ARXPS spectra
- For each element, calculate the RDP ratio: ln{Peak Area (Surface) / Peak Area (Bulk)}

A relative depth index can be calculated using this ratio. An indication of the layer order can then be achieved by plotting out the relative depth index for each species.

[Figure: sample stack C / HfO2/Al2O3 / SiO2 / Si and its RDP, with species running from Surface to Bulk: C1s, Al 2p, O1s (Low BE), Hf 4f, O 1s (High BE), Si 2p (Ox), Si 2p (El).]

RDP

- Information
- - Reveals the ordering of the chemical species
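The RDP construction can be sketched in a few lines. Here "surface" and "bulk" peak areas are taken to mean areas measured at a grazing (surface sensitive) angle and at a near-normal (more bulk sensitive) angle respectively; the peak areas are invented for illustration and the function name is not from the slides.

```python
import math

def relative_depth_index(area_surface: float, area_bulk: float) -> float:
    """RDP ratio: ln(peak area at surface-sensitive angle /
    peak area at bulk-sensitive angle)."""
    return math.log(area_surface / area_bulk)

# Hypothetical peak areas: (surface-sensitive angle, bulk-sensitive angle)
peak_areas = {
    "Si 2p (El)": (50.0, 200.0),
    "C 1s": (120.0, 40.0),
    "Si 2p (Ox)": (300.0, 280.0),
}

rdp = {name: relative_depth_index(s, b) for name, (s, b) in peak_areas.items()}
# Largest index = nearest the surface, smallest = deepest
layer_order = sorted(rdp, key=rdp.get, reverse=True)
print(" > ".join(layer_order))
```

The result is an ordering only: the index has no depth scale and says nothing about concentrations.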

ALD TaN Film: chemical state RDP

[Figure: angle resolved spectra from TaN sample, with components labelled TaOx and TaNt.]

Relative Depth Profile, RDP

- Advantages
- Fast
- Model independent, no assumptions

- Limitation
- No depth scale
- No concentration profile structures
- In my opinion an RDP is the most generally useful approach in ARXPS for characterization of unknown film structures seen during process development.

[Figure: Atomic percent (%) vs. Angle (°), 30° to 80°.]

Max. Ent. : Depth Profile Generation

- Generate random profile
- Calculate expected ARXPS data (Beer-Lambert law): Tj(θ) = exp(-t/(λ cos θ))

[Figure: example sample C / Al2O3 / SiO2 / Si, with calculated angular data for O, Si4+, C, Si0 and Al, running from surface sensitive to more bulk sensitive angles.]

Depth Profile Generation (cont.)

- Determine error between observed and calculated data (χ²)
- Calculate the entropy associated with a particular profile (the probability of finding the sample in that particular state)
- - cj,i is the concentration of element i in layer j
- The MaxEnt solution is derived by minimising χ² while maximising the entropy
- - Maximise the joint probability function
- Repeat process to obtain most likely profile
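The slide equations for the error and entropy terms did not survive extraction, so the following is only a sketch of how such an objective is commonly assembled, not the Thermo implementation. It assumes a Beer-Lambert forward model with a single λ, a uniform uncertainty σ on every point, an entropy of the form Σ(c − c₀ − c ln(c/c₀)) relative to a flat default c₀, and a joint function Q = αS − χ²/2. All names and numbers are illustrative.

```python
import math

def expected_intensities(profile, angles_deg, layer_thickness, lam):
    """Forward model: profile[j][i] is the concentration of element i in
    layer j (layer 0 at the surface); layer j is attenuated by
    T_j = exp(-t_j / (lam * cos(theta))), t_j being the depth of its top."""
    calc = []
    for theta in angles_deg:
        path = lam * math.cos(math.radians(theta))
        row = [0.0] * len(profile[0])
        for j, layer in enumerate(profile):
            t_j = math.exp(-(j * layer_thickness) / path)
            for i, c in enumerate(layer):
                row[i] += c * t_j
        calc.append(row)
    return calc

def chi_squared(calc, obs, sigma):
    """Misfit between calculated and observed data (uniform sigma assumed)."""
    return sum(((c - o) / sigma) ** 2
               for rc, ro in zip(calc, obs) for c, o in zip(rc, ro))

def entropy(profile, default):
    """Assumed entropy form; zero for the flat default, negative otherwise."""
    s = 0.0
    for layer in profile:
        for c in layer:
            s += c - default
            if c > 0:
                s -= c * math.log(c / default)
    return s

def objective(profile, obs, angles_deg, layer_thickness, lam, alpha,
              sigma=0.01, default=0.5):
    calc = expected_intensities(profile, angles_deg, layer_thickness, lam)
    return alpha * entropy(profile, default) - 0.5 * chi_squared(calc, obs, sigma)

angles = [25, 40, 55, 70]
layered = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # film on substrate
flat = [[0.5, 0.5]] * 4                                      # featureless guess
observed = expected_intensities(layered, angles, 5.0, 30.0)  # synthetic data

q_layered = objective(layered, observed, angles, 5.0, 30.0, alpha=2e-4)
q_flat = objective(flat, observed, angles, 5.0, 30.0, alpha=2e-4)
print(q_layered > q_flat)
```

A search procedure would repeatedly propose profiles and keep those that raise Q; the choice of α sets how strongly smoothness is favoured over fidelity to the data.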

Reliability of Max Ent Modeling

- Simple model fit to the data can never be unique! The Max Ent approach (balance with Entropy) is a “regularization” approach. Details of results are nearly always over-interpreted.
- Balance of χ² and α is operator (or recipe) chosen
- Requires experience with sample at that thickness
- Requires assumptions about ‘unrealistic solutions’
- e.g. Too spiky a distribution? χ² weighting too high (or α too small)
- e.g. Too smooth, substrate never reaches 100%, film elements never go to 0%? α too big

- For a ‘simple’ film of < 2λ; with good statistics data; a substrate with no species common to the film; zero or small surface contamination
- Develop reliable recipe (χ², α, … verification?)
- Possible to obtain a reliable profile for system appropriate to that ‘recipe’ (see examples following)
- Is it for Si/O/N with t, N dose variations?

[Figure: Max Ent depth profile of HfSiON, At % vs. Depth / nm (0-5 nm), showing Hf, N, O, Si4+ and Si0, with Total Si (MEIS) for comparison.]

[Figure: SAM on Ag / TiW / Quartz, where SAM = -S-(CH2)11-(O-CH2-CH2-)3-OH. Depth Profile and Relative Depth Plot for Ag 3d, C 1s (H/C), C 1s (Ether), O 1s and S 2p.]

Example of Max Ent Derived Depth Profile on an Ultra-Thin Si/O/N Film

- Reliability?
- Need high quality angular data – good S/N
- Need “constraints” and a “recipe” for the α term

Effect of Depth Distribution on Peak Intensity Ratios

Extreme Example: answer qualitatively obvious from raw data or RDP, but cannot know whether detailed Max Ent distribution is valid without verification/calibration by some other method.

Repeatability of ARXPS Concentration Profiles

- Three ARXPS datasets acquired dynamically from a point on a Si oxynitride sample (sample repositioned each time).
- Concentration profiles reconstructed from each dataset.
- Good reproducibility of reconstructed profiles.

Relative Depth Plot for 8 sample set: process A

Set A

Relative Depth Plot for 8 sample set: process B

Set B

June 2003. Max. Ent. α=2e-4

Process A

- Process 11A: t = 19.8Å, N% = 6.7%, N Dose = 8.57×10¹⁴ atoms/cm²
- Process 1A: t = 14.1Å, N% = 8.5%, N Dose = 7.61×10¹⁴ atoms/cm²
- Process 3A: t = 16.3Å, N% = 16%, N Dose = 1.70×10¹⁵ atoms/cm²

Process B

- Process 15B: t = 10.4Å, N% = 9.6%, N Dose = 6.33×10¹⁴ atoms/cm²
- Process 3B: t = 11.2Å, N% = 12.1%, N Dose = 8.41×10¹⁴ atoms/cm²
- Process 13B: t = 14.2Å, N% = 18.6%, N Dose = 1.72×10¹⁵ atoms/cm²
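If Dose = N%·t·C held exactly, the quantity Dose/(N%·t) would be the same constant C for every sample. A quick check on the six samples listed above, using only the values quoted on these slides (variable names are illustrative):

```python
# (label, t in Angstrom, N in percent, N dose in atoms/cm^2), as quoted above
samples = [
    ("11A", 19.8, 6.7, 8.57e14),
    ("1A", 14.1, 8.5, 7.61e14),
    ("3A", 16.3, 16.0, 1.70e15),
    ("15B", 10.4, 9.6, 6.33e14),
    ("3B", 11.2, 12.1, 8.41e14),
    ("13B", 14.2, 18.6, 1.72e15),
]

# Implied C for each sample, in atoms/cm^2 per Angstrom of film
implied_c = {label: dose / ((n_pct / 100.0) * t)
             for label, t, n_pct, dose in samples}

mean_c = sum(implied_c.values()) / len(implied_c)
spread = (max(implied_c.values()) - min(implied_c.values())) / mean_c

for label, c in implied_c.items():
    print(f"{label:>4s}: C = {c:.3e}")
print(f"relative spread (max - min) / mean = {100 * spread:.1f}%")
```

The spread in the implied C is a direct measure of how far these Max Ent doses depart from the pure N%·t approach.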

[Figure: Si 2p spectrum, Binding Energy 94-110 eV, with Si0, Si/O/N and SiO2 components labelled.]

Example of Chemical Depth Profiling, June 2003:

Distinction of Si-O Using Si Chemical Shifts

- Different Si 2p binding energy for Si4+ in SiO2 and Si/O/N allows separation in profile
- t = 21.1 Å N = 29.8%

[Figure: SET 6B depth profile with SiO4 and SiO3N components. “Film is actually more like this:” schematic with labels Post Oxidation?, SiO2, Graded Region, Si/O/N and Si (100).]

[Figure: SET 10A, t = 20.1Å, N = 23.7%.]

[Figure: N distributions for Slots 1, 3, 3*, 6, 10, 11, 13 and 15, with the interface marked.]

Normalized Overlays of N Distribution, June 2003

- Set A and set B are very similar (not expected)
- N distribution does not change much with N total dose
- Hard to get more than 10% N absolute at surface (air oxidation and HC pickup will reduce N content)
- No evidence for a nitrogen spike at the surface, cf. TOF SIMS.
- (this was the original reason for studying these sets of samples)

Set 1

Set 2

October 2003. Max. Ent. α=5e-007. Set A

October 2003. Max. Ent. α=5e-007. Set B

June 2004. Max. Ent. α=5e-07. Process A

June 2004. Max. Ent. α=5e-07. Process B

[Figure: layer schematic, SiO2 / Si3N4 + SiO2 / SiO2 on Si, with layer thicknesses labelled d1, d2 and d3.]

- Assume 2 layers of SiO2, 1 layer SiOxNy, substrate
- Total d value is fixed from Si2p spectrum
- Adjust d1, d3 and N concentration to get best fit to ARXPS data
- Advantages
- - Fast: only needs to fit 3 parameters (by least squares fitting); easily automated
- - Accurate: attenuation lengths can be calculated for each layer
- - Precise: only needs to fit 3 parameters
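To show the shape of such a three-parameter fit, here is a toy version. It is not the Thermo algorithm: it fits only a normalised N 1s intensity, uses one attenuation length for all layers, and replaces the least squares optimiser with a brute-force grid search. The names d_top and d_bottom (the two SiO2 layer thicknesses) and all numbers are illustrative assumptions.

```python
import math
from itertools import product

def n1s_intensity(d_top, d_mid, c_n, lam, theta_deg):
    """Normalised N 1s signal from a nitrided layer of thickness d_mid
    buried under d_top of SiO2 (Beer-Lambert attenuation)."""
    path = lam * math.cos(math.radians(theta_deg))
    return c_n * math.exp(-d_top / path) * (1.0 - math.exp(-d_mid / path))

def fit_three_layer(angles_deg, observed, d_total, lam):
    """Grid search over d_top, d_bottom and N concentration (percent),
    with the total thickness d_total held fixed."""
    thicknesses = [0.5 * k for k in range(int(2 * d_total) + 1)]
    best = None
    for d_top, d_bottom, pct in product(thicknesses, thicknesses, range(2, 42, 2)):
        d_mid = d_total - d_top - d_bottom
        if d_mid <= 0:
            continue  # the nitrided layer must have positive thickness
        sse = sum((n1s_intensity(d_top, d_mid, pct / 100, lam, a) - o) ** 2
                  for a, o in zip(angles_deg, observed))
        if best is None or sse < best[0]:
            best = (sse, d_top, d_bottom, pct)
    return best[1:]

# Synthetic data from a known structure (Angstrom), then recover it
angles = [25, 35, 45, 55, 65, 75]
d_total, lam = 16.0, 30.0
observed = [n1s_intensity(3.0, 8.0, 0.20, lam, a) for a in angles]
fitted = fit_three_layer(angles, observed, d_total, lam)
print(fitted)
```

Because the structure is constrained to three parameters, the result is fully reproducible for a given data set, but it can only be as accurate as the assumed layer model.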

Silicon Oxynitride: Automated N distribution correction

[Figure: Maximum Entropy results compared with the automated N distribution; profiles of Si0, Sin+, O and N in each case.]

3-layer model Jan 2007 (Can’t sell this to a process engineer!)

[Figures: 3-layer model results, Jan 2007.]

Effect of varying angular range. Jan 2007. α=5e007

[Figures: four panels.]

Comparison of 3-layer model to full Max. Ent June 2004

[Figures: three panels.]

- 300 mm wafer
- Single measurement
- 49-point maps (after initial depth distribution determination at the wafer center)

[Figures: Thickness map and Dose map.]

- Thickness can be obtained to the required precision for 10 to 40Å homogeneous composition (lateral and in depth) films. The accuracy, or even relative accuracy, depends on how much effort is put into calibration and what range of thickness or materials changes are occurring. For inhomogeneous films (lateral or depth) errors will occur, which will depend on the specifics. OK for thickness metrology. Comparison to ellipsometry?
- At one extreme, for a first time analysis of a new film composition, with little or no constraints on what could be the situation, do not go beyond a dimensionless qualitative Relative Depth Profile approach (which can, nevertheless, be extremely useful).
- At the other extreme, where a very constrained system is involved (i.e. you either already nearly know the answer, or the depth distribution is so extreme it is basically obvious from the raw data), ARXPS, plus appropriate data modeling, can give depth distributions to some degree, but never, in real situations, a unique highly precise profile.
- High quality data and considerable intellectual effort are needed to write (and verify) a “recipe” for fitting/modeling the data. That recipe then only works within a narrow confine of constraints and, even then, only provides imprecise and not highly depth resolved information. In our opinion this means that, though ARXPS has its uses for characterization within the wafer industry, it is not suitable for rapid metrology intended to provide detailed information on depth distributions and related parameters which may rely on knowing the depth distribution (like dose for instance).
- May be OK for dose metrology for a small dose change range, using automated 3 layer model. Will be precise and reproducible, but only as accurate as the 3 layer model is accurate.