
# Parallel Simulated Annealing for EAM Potential Fitting

By Tao Xu. CS6230 Final Presentation, 05/05/05.



• Introduction to EAM potential

• The object/cost function

• Simulated Annealing Algorithm

• Synchronous Parallel Simulated Annealing

• Asynchronous Parallel Simulated Annealing

• Conclusions and References

The EAM potential

The EAM potential is built from a pair potential φ(r), an atomic density function ρ(r), and an embedding function U(n).
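For reference, the standard EAM form of Daw and Baskes [1] combines these three functions as follows, writing φ for the pair potential, ρ for the atomic density, and U for the embedding function. The symbols E_tot, r_ij (the distance between atoms i and j), and n_i (the host density at atom i) are notation assumed here, not taken from the slides.

```latex
E_{\mathrm{tot}} = \frac{1}{2} \sum_{i} \sum_{j \neq i} \phi(r_{ij}) + \sum_{i} U(n_i),
\qquad
n_i = \sum_{j \neq i} \rho(r_{ij})
```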

Let α denote the entire set of L parameters α1, α2, …, αL used to characterize these functions.

The goal is to determine the optimal set α* by matching the forces from first-principles calculations with those predicted by the classical potential.

The key to the force-matching method is minimizing the objective function Z(α) = ZF(α) + ZC(α), where:

M: number of sets of atomic configurations (e.g., structures).

Nk: number of atoms in configuration k.

Fki(α): the force on the i-th atom in set k obtained with parameter set α.

Fki0: the reference force from first-principles calculations.

ZC: contributions from Nc additional constraints.

Ar(α): physical quantities as calculated from the potential.
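A plausible reconstruction of the two terms, following the usual force-matching formulation and writing α for the parameter set, is sketched below. The normalization by three times the total number of atoms, the weights W_r, and the reference values A_r^0 are assumptions: they do not appear in the definitions above.

```latex
Z_F(\alpha) = \Bigl( 3 \sum_{k=1}^{M} N_k \Bigr)^{-1}
              \sum_{k=1}^{M} \sum_{i=1}^{N_k}
              \bigl| \mathbf{F}_{ki}(\alpha) - \mathbf{F}^{0}_{ki} \bigr|^{2}

Z_C(\alpha) = \sum_{r=1}^{N_C} W_r \, \bigl| A_r(\alpha) - A^{0}_{r} \bigr|^{2}
```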

[Flowchart, simulated annealing algorithm: initial configuration α → random number generator → create new random configuration α’ → evaluate the cost function → acceptance probability (Yes/No) → accept new configuration → terminate search? → END]
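The loop in the flowchart can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the Metropolis acceptance rule exp(-ΔZ/T), the geometric cooling schedule, and the names `simulated_annealing`, `toy_cost`, and `gaussian_step` are all assumptions, and the quadratic toy cost merely stands in for the force-matching objective.

```python
import math
import random


def simulated_annealing(cost, propose, x0, t0=1.0, cooling=0.95,
                        steps_per_temp=100, t_min=1e-3, rng=None):
    """Minimize cost(x); returns the best configuration seen and its cost."""
    rng = rng or random.Random()
    x, z = x0, cost(x0)
    best_x, best_z = x, z
    t = t0
    while t > t_min:                        # "Terminate search?"
        for _ in range(steps_per_temp):
            x_new = propose(x, rng)         # create new random configuration
            z_new = cost(x_new)             # evaluate the cost function
            dz = z_new - z
            # Assumed Metropolis rule: always accept improvements,
            # accept uphill moves with probability exp(-dz / t).
            if dz <= 0 or rng.random() < math.exp(-dz / t):
                x, z = x_new, z_new
                if z < best_z:
                    best_x, best_z = x, z
        t *= cooling                        # assumed geometric cooling
    return best_x, best_z


def toy_cost(a):
    """Hypothetical stand-in for Z(a): minimum 0 at a = (1, 1, ...)."""
    return sum((ai - 1.0) ** 2 for ai in a)


def gaussian_step(a, rng, scale=0.1):
    """Perturb every parameter with Gaussian noise."""
    return [ai + rng.gauss(0.0, scale) for ai in a]


best, z_best = simulated_annealing(toy_cost, gaussian_step, [5.0, -3.0],
                                   rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the run reproducible, which helps when comparing cooling schedules.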

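[Diagram, synchronous parallel simulated annealing: the ROOT (0) sends the initial configuration to the workers P1, P2, P3, P4, ...; collects data from the workers; sends the best configuration back to them; and at the end collects the final configuration.]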

• The root can choose the configuration to send back to the workers in three ways:

• Minimum: the root chooses the configuration with the lowest cost function;

• Random: the root chooses one of the configurations at random;

• Metropolis-like: the root may choose the minimum, but accepts the others with some nonzero probability.

• In the current implementation, the minimum-cost configuration is chosen at each temperature.
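The three selection rules can be sketched as follows. The function names are hypothetical, and the particular Metropolis-like rule (pick a random candidate and accept it over the minimum with a Boltzmann probability) is only one possible reading of that option, not something the slides specify.

```python
import math
import random


def choose_minimum(configs, costs):
    """Return the configuration with the lowest cost."""
    i = min(range(len(costs)), key=costs.__getitem__)
    return configs[i], costs[i]


def choose_random(configs, costs, rng):
    """Return one configuration uniformly at random."""
    i = rng.randrange(len(configs))
    return configs[i], costs[i]


def choose_metropolis_like(configs, costs, temperature, rng):
    """Default to the minimum, but accept a random other candidate
    with probability exp(-dz / temperature) (assumed rule)."""
    i_min = min(range(len(costs)), key=costs.__getitem__)
    j = rng.randrange(len(configs))
    dz = costs[j] - costs[i_min]            # never negative
    if j != i_min and rng.random() < math.exp(-dz / temperature):
        return configs[j], costs[j]
    return configs[i_min], costs[i_min]


configs = ["a", "b", "c"]
costs = [3.0, 1.0, 2.0]
picked = choose_minimum(configs, costs)
```

At very low temperature the Metropolis-like rule reduces to the minimum rule, since the acceptance probability for any worse candidate goes to zero.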

• The advantages of synchronous PSA:

• Attains a near-linear speedup. With n processors, the program searches a factor of n more configurations, which increases the chance of “stumbling” onto the correct configuration more quickly.

• Easy to implement. The only message-passing occurs at the synchronous steps.

• The disadvantages of synchronous PSA:

• Idle time: if one processor obtains the required number of successes before the others, it must wait for them to finish.

• Synchronization cost: A global gathering and rebroadcasting of large configurations can be time-consuming. However, this is not usually a problem with smaller systems.
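The synchronous scheme can be illustrated with a serial stand-in in which the "workers" run one after another. This is a sketch under stated assumptions, not the presented implementation: each worker takes a fixed number of steps per temperature (rather than waiting for a required number of successes), the root applies the minimum rule, and the names `anneal_at_temperature`, `synchronous_psa`, and `toy_cost` are hypothetical.

```python
import math
import random


def anneal_at_temperature(cost, x, t, steps, rng, scale=0.1):
    """Run a fixed number of Metropolis steps at temperature t."""
    z = cost(x)
    for _ in range(steps):
        x_new = [xi + rng.gauss(0.0, scale) for xi in x]
        z_new = cost(x_new)
        dz = z_new - z
        if dz <= 0 or rng.random() < math.exp(-dz / t):
            x, z = x_new, z_new
    return x, z


def synchronous_psa(cost, x0, n_workers=4, t0=1.0, cooling=0.9,
                    t_min=1e-3, steps=50, seed=0):
    """Serial simulation of synchronous parallel simulated annealing."""
    rngs = [random.Random(seed + p) for p in range(n_workers)]
    shared = list(x0)                       # root sends initial configuration
    t = t0
    while t > t_min:
        # Every worker anneals from the same shared configuration.
        results = [anneal_at_temperature(cost, list(shared), t, steps, rng)
                   for rng in rngs]
        # Synchronization step: gather, keep the minimum, rebroadcast.
        shared, _ = min(results, key=lambda r: r[1])
        t *= cooling
    return shared, cost(shared)


def toy_cost(a):
    """Hypothetical stand-in for the objective: minimum 0 at (1, 1, ...)."""
    return sum((ai - 1.0) ** 2 for ai in a)


best, z_best = synchronous_psa(toy_cost, [5.0, -3.0])
```

In a real parallel run the list comprehension would be replaced by a gather from the worker processes, which is where the idle time and synchronization cost described above arise.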

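[Diagram, asynchronous parallel simulated annealing: the ROOT sends the initial configuration to the workers P1, P2, P3, P4, ...; a Register (T) on the ROOT collects data from the workers; and at the end the ROOT collects the final configuration.]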

• Every processor controls its own cooling schedule;

• Each processor works independently of the others, so none has to wait for the others to finish. When a processor finishes at one temperature, it checks its value against the global register. If its value is worse, it takes the configuration from the register. Otherwise, it writes its value to the register.

• The best configuration is always stored in a global register on a master processor.

• No processor sits idle during the run: when a processor finishes at one temperature, it goes on to the next. However, there might still be some idle time at the end of the program.

• No expensive synchronization steps. Communications are smaller but more frequent.
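The register protocol described above can be sketched with Python threads and a lock. This is an illustration of the compare-and-exchange logic only: the class and function names are hypothetical, Python threads do not deliver a real speedup for CPU-bound work, and an actual implementation would more likely exchange messages between processes.

```python
import math
import random
import threading


class GlobalRegister:
    """Holds the best configuration seen so far (hypothetical helper)."""

    def __init__(self, config, cost):
        self._lock = threading.Lock()
        self._config, self._cost = list(config), cost

    def read(self):
        with self._lock:
            return list(self._config), self._cost

    def exchange(self, config, cost):
        """Write if the worker is not worse; return the register's content."""
        with self._lock:
            if cost <= self._cost:
                self._config, self._cost = list(config), cost
            return list(self._config), self._cost


def worker(register, cost, seed, t0=1.0, cooling=0.9, t_min=1e-3, steps=50):
    rng = random.Random(seed)
    x, z = register.read()
    t = t0
    while t > t_min:                        # each worker owns its schedule
        for _ in range(steps):
            x_new = [xi + rng.gauss(0.0, 0.1) for xi in x]
            z_new = cost(x_new)
            dz = z_new - z
            if dz <= 0 or rng.random() < math.exp(-dz / t):
                x, z = x_new, z_new
        x, z = register.exchange(x, z)      # no waiting for other workers
        t *= cooling


def asynchronous_psa(cost, x0, n_workers=4):
    register = GlobalRegister(x0, cost(x0))
    threads = [threading.Thread(target=worker, args=(register, cost, seed))
               for seed in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()                           # only idle time: end of the run
    return register.read()


def toy_cost(a):
    """Hypothetical stand-in for the objective: minimum 0 at (1, 1, ...)."""
    return sum((ai - 1.0) ** 2 for ai in a)


best, z_best = asynchronous_psa(toy_cost, [5.0, -3.0])
```

Because the register only ever accepts values that are no worse than its current content, its cost can never increase, regardless of the order in which workers reach it.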

• Simulated annealing converges, given a sufficiently slow cooling schedule, but it takes a long time to find the minimum;

• Thus, parallelizing simulated annealing is desirable. Because the search space is explored faster, a near-linear speedup in convergence time is obtained;

• Asynchronous annealing converges faster than synchronous annealing because of its near-zero idle time, and it achieves a better speedup.

[1] M. S. Daw and M. I. Baskes: Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals. Physical Review B, Vol. 29, No. 12, June 1984, pp. 6443-6453

[2] R. A. Johnson: Analytic nearest-neighbor model for fcc metals. Physical Review B, Vol. 37, No. 8, March 1988, pp. 3924-3931

[3] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi: Optimization by Simulated Annealing. Science, Vol. 220, No. 4598, May 1983, pp. 671-680