
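| Basis | Hydrogen (contracted) | Hydrogen (primitive) | First row elements (contracted) | First row elements (primitive) | Second row elements (contracted) | Second row elements (primitive) |
|---|---|---|---|---|---|---|
| cc-pVDZ | 2s1p | 4s | 3s2p1d | 9s4p | 4s3p2d | 12s8p |
| cc-pVTZ | 3s2p1d | 5s | 4s3p2d1f | 10s5p | 5s4p3d1f | 15s9p |
| cc-pVQZ | 4s3p2d1f | 6s | 5s4p3d2f1g | 12s6p | 6s5p4d2f1g | 16s11p |
| cc-pV5Z | 5s4p3d2f1g | 8s | 6s5p4d3f2g1h | 14s8p | 7s6p5d3f2g1h | 20s12p |
| cc-pV6Z | 6s5p4d3f2g1h | 10s | 7s6p5d4f3g2h1i | 16s10p | 8s7p6d4f3g2h1i | 21s14p |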

order polarization function. For second row systems it has been found that the performance is significantly improved by adding an extra tight d-function.27

The energy-optimized cc-basis sets can be augmented with diffuse functions, indicated by adding the prefix aug- to the acronym.28 The augmentation consists of adding one extra function with a smaller exponent for each angular momentum, i.e. the aug-cc-pVDZ has additionally one s-, one p- and one d-function, the aug-cc-pVTZ has 1s1p1d1f extra for non-hydrogens and so on. The cc-basis sets may also be augmented with additional tight functions (large exponents) if the interest is in recovering core-core and core-valence electron correlation, producing the acronyms cc-pCVXZ (X = D, T, Q, 5). The cc-pCVDZ has additionally one tight s- and one tight p-function, the cc-pCVTZ has 2s2p1d tight functions, the cc-pCVQZ has 3s3p2d1f and the cc-pCV5Z has 4s4p3d2f1g for non-hydrogens.29
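The bookkeeping behind these acronyms can be sketched in a few lines. The helper names `parse_shells` and `add_shells` are hypothetical and only illustrate how the diffuse or tight sets quoted above combine with a contracted composition for a first row element.

```python
import re

# Shell labels in order of increasing angular momentum
SHELL_ORDER = "spdfghi"


def parse_shells(composition):
    """Turn a string such as '4s3p2d1f' into {'s': 4, 'p': 3, 'd': 2, 'f': 1}."""
    pairs = re.findall(r"(\d+)([spdfghi])", composition)
    return {label: int(count) for count, label in pairs}


def add_shells(base, extra):
    """Add the functions in `extra` to the composition `base`, shell by shell."""
    counts = parse_shells(base)
    for label, count in parse_shells(extra).items():
        counts[label] = counts.get(label, 0) + count
    return "".join(f"{counts[l]}{l}" for l in SHELL_ORDER if l in counts)


# Diffuse augmentation of the first row cc-pVDZ and cc-pVTZ contractions
aug_dz = add_shells("3s2p1d", "1s1p1d")
aug_tz = add_shells("4s3p2d1f", "1s1p1d1f")

# Tight (core-valence) augmentation of the first row cc-pVTZ contraction
cv_tz = add_shells("4s3p2d1f", "2s2p1d")

print(aug_dz, aug_tz, cv_tz)
```

The sketch only manipulates composition labels; the exponents of the added functions have to be taken from the published basis sets.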

### 5.4.7 Polarization consistent basis sets

The basis set convergence of electron correlation methods is inverse polynomial in the highest angular momentum functions included in the basis set, while the convergence of the independent-particle HF and DFT methods is exponential.30 This difference in convergence properties suggests that the optimum basis sets for the two cases will also be different; in particular, low angular momentum functions should be more important for HF/DFT methods than for electron correlation methods as the basis set becomes large. Since DFT methods (Chapter 6) are rapidly becoming the preferred method for routine calculations, it is of interest to have basis sets that are optimized for DFT type calculations, and that are capable of systematically approaching the basis set limit. The polarization consistent (pc) basis sets are developed analogously to the correlation consistent basis sets except that they are optimized for DFT methods.31 The name indicates that they are geared towards describing the polarization of the (atomic) electron density upon formation of a molecule, rather than describing the correlation energy. Since there is little difference between HF and DFT, and even less difference between different DFT functionals, these basis sets are suitable for independent-particle methods in general.

The polarization consistent basis sets again employ an energetic criterion for determining the importance of each type of basis function. The level of polarization beyond the isolated atom is indicated by a value after the acronym, i.e. a pc-0 basis set is unpolarized, pc-1 contains a single polarization function with one higher angular momentum, pc-2 contains polarization functions up to two beyond that required for the atom, etc. In contrast to the cc-pVxZ basis sets, the importance of the polarization functions must be determined at the molecular level, since the atomic energies only depend on s- and p-functions (at least for elements in the first two rows in the periodic table). For the DZ and TZ type basis sets (pc-1 and pc-2), the consistent polarization is the same as for the cc-pVxZ basis sets (1d and 2d1f), but at the QZ and 5Z levels (pc-3 and pc-4) there are one and two additional d-functions (4d2f1g and 6d3f2g1h), respectively. The s- and p-basis set exponents are optimized at the DFT level for the atoms, while the polarization exponents are selected as suitable average values from optimizations for a selection of molecules. The primitive functions are subsequently contracted by a general contraction scheme by using the atomic orbital coefficients.

| Basis | Hydrogen (contracted) | Hydrogen (primitive) | First row elements (contracted) | First row elements (primitive) | Second row elements (contracted) | Second row elements (primitive) |
|---|---|---|---|---|---|---|
| pc-0 | 2s | 3s | 3s2p | 5s3p | 4s3p | 8s6p |
| pc-1 | 2s1p | 4s | 3s2p1d | 7s4p | 4s3p1d | 11s8p |
| pc-2 | 3s2p1d | 6s | 4s3p2d1f | 10s6p | 5s4p2d1f | 13s10p |
| pc-3 | 5s4p2d1f | 9s | 6s5p4d2f1g | 14s9p | 6s5p4d2f1g | 17s13p |
| pc-4 | 7s6p3d2f1g | 11s | 8s7p6d3f2g1h | 18s11p | 7s6p6d3f2g1h | 20s16p |

For properties dependent on the wave function tail, such as electric moments and polarizabilities, the convergence towards the basis set limit can be improved by explicitly adding a set of diffuse functions, producing the acronym aug-pc-n.

### 5.4.8 Basis set extrapolation

The main advantage of the ANO, correlation consistent and polarization consistent basis sets is the ability to generate a sequence of basis sets that converges toward the basis set limit in a systematic fashion. For example, from a series of calculations with the 3-21G, 6-31G(d,p), 6-311G(2d,2p) and 6-311++G(3df,3pd) basis sets it may not be obvious whether the property of interest is "converged" with respect to further increases in the basis, and it is difficult to estimate what the basis set limit would be. This is partly due to the fact that different primitive GTOs are used in each of these segmented basis sets, and partly due to the lack of higher angular momentum functions. From the same (large) set of primitive GTOs, however, increasingly large ANO basis sets may be generated by a general contraction scheme that allows an estimate of the basis set limiting value. Similarly, the cc-pVxZ basis sets consistently reduce errors (both HF and correlation) for each step up in quality. In test cases it has been found that the cc-pVDZ basis can provide ~65% of the total (valence) correlation energy, the cc-pVTZ ~85%, cc-pVQZ ~93%, cc-pV5Z ~96% and cc-pV6Z ~98%, with similar reductions of the HF error.

Given the systematic nature of the cc basis sets, several different schemes have been proposed for extrapolation to the infinite basis set limit, using the highest angular momentum Lmax included in the basis set as the extrapolating parameter.32 At the HF and DFT levels the convergence is expected to be exponential, and indeed functions of the form shown in eq. (5.10) in connection with the cc-pVxZ basis sets usually provide a good fit.33
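Eq. (5.10) is not reproduced in this text. As an illustration only, the sketch below assumes the common three-parameter exponential form E(L) = E∞ + A exp(-BL). For three energies at consecutive values of L, the limit then follows in closed form, since successive energy differences shrink by the constant factor exp(-B). The function name and the numbers are hypothetical, and the energies are synthetic rather than computed.

```python
import math


def exponential_limit(e1, e2, e3):
    """Estimate E_inf from energies at three consecutive L values.

    Assumes E(L) = E_inf + A*exp(-B*L), an assumed form for the exponential fit.
    """
    d1 = e1 - e2  # energy lowering from the first to the second basis set
    d2 = e2 - e3  # energy lowering from the second to the third basis set
    if d1 == d2:
        raise ValueError("energy differences are equal; no exponential decay to fit")
    if d1 * d2 <= 0 or abs(d2) >= abs(d1):
        raise ValueError("energies do not converge monotonically")
    # d2/d1 = exp(-B), and the remaining error after e3 is d2**2 / (d1 - d2)
    return e3 - d2 * d2 / (d1 - d2)


# Synthetic energies generated from the assumed model (not real calculations)
E_INF, A, B = -76.0, 0.5, 1.2
energies = [E_INF + A * math.exp(-B * L) for L in (2, 3, 4)]
estimate = exponential_limit(*energies)
print(f"extrapolated limit: {estimate:.8f}")
```

With real data the three points rarely follow the model exactly, so the result should be read as an estimate whose quality depends on how large the basis sets are.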

An alternative fitting function (eq. (5.11)) for use with the pc-n basis sets has been shown to improve the accuracy of absolute energies by almost an order of magnitude, although relative energies are only marginally improved.34 The number of s-functions (Ns) in the basis set is here used as the main extrapolating parameter.
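Eq. (5.11) is likewise not reproduced here. The sketch below assumes a form E = E∞ + A(L + 1)exp(-B√Ns), which should be checked against the original reference before use. If B is held at a fixed value, only E∞ and A remain, and two calculations suffice. The function name, the value of B and the (L, Ns) pairs are illustrative assumptions.

```python
import math


def pc_limit_fixed_b(e1, l1, ns1, e2, l2, ns2, b):
    """Two-point estimate of E_inf for an assumed model
    E = E_inf + A*(L + 1)*exp(-B*sqrt(Ns)) with B held fixed."""
    f1 = (l1 + 1) * math.exp(-b * math.sqrt(ns1))
    f2 = (l2 + 1) * math.exp(-b * math.sqrt(ns2))
    if f1 == f2:
        raise ValueError("the two basis sets give identical model factors")
    a = (e1 - e2) / (f1 - f2)  # solve the two linear equations for A
    return e1 - a * f1         # and then for E_inf


# Synthetic energies generated from the assumed model (not real calculations)
E_INF, A, B = -100.0, 0.3, 1.5
points = [(3, 4), (4, 6)]  # illustrative (L, Ns) pairs
energies = [E_INF + A * (l + 1) * math.exp(-B * math.sqrt(ns)) for l, ns in points]
estimate = pc_limit_fixed_b(energies[0], 3, 4, energies[1], 4, 6, B)
print(f"extrapolated limit: {estimate:.8f}")
```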

Exponential forms like eq. (5.10) have also been used for extrapolating the total energy at correlated levels of theory with the cc-pVxZ basis sets. Theoretical analysis, however, suggests that the correlation energy itself (i.e. not the total energy, which includes the HF contribution) should converge with an inverse power dependence, with the leading term for singlet electron pairs being $(L + 1)^{-3}$ while the leading term for triplet pairs is $(L + 1)^{-5}$.35 The theoretical assumption underlying these results is that the basis set is saturated in the radial part (e.g. a TZ type basis set should be complete in the s-, p-, d- and f-function space). This is not the case for the correlation consistent basis sets: even for the cc-pV6Z basis set, the errors due to insufficient numbers of s- to i-functions are comparable with that from neglect of functions with angular momentum higher than i-functions. Nevertheless, it has been found that extrapolations based on only the leading $L^{-3}$ term give good results when compared with accurate results generated by, for example, R12 methods.36 This has the advantage that the infinite basis set result can be estimated from only two calculations with basis sets having maximum angular momentum N and M according to eq. (5.12).
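The two-point estimate can be sketched as follows. Eq. (5.12) itself is not reproduced in this text, so the code assumes the leading-term model E(L) = E∞ + A(L + shift)^-3 and eliminates A between the two calculations. The offset `shift` is left as a parameter because different offsets are used in the literature. The function name and the energies are hypothetical.

```python
def two_point_limit(e_n, e_m, n, m, shift=0.0):
    """Estimate the basis set limit of the correlation energy from two
    calculations with maximum angular momentum n and m.

    Assumes E(L) = E_inf + A*(L + shift)**-3. Multiplying each equation by
    (L + shift)**3 and subtracting removes A.
    """
    if n == m:
        raise ValueError("the two basis sets must differ in maximum angular momentum")
    w_n = (n + shift) ** 3
    w_m = (m + shift) ** 3
    return (w_n * e_n - w_m * e_m) / (w_n - w_m)


# Synthetic correlation energies generated from the assumed model
E_INF, A = -0.300, 0.150
e_tz = E_INF + A * 3 ** -3  # L = 3
e_qz = E_INF + A * 4 ** -3  # L = 4
estimate = two_point_limit(e_qz, e_tz, 4, 3)
print(f"extrapolated correlation energy: {estimate:.6f}")
```

Only the correlation energy should be passed to such a function; the HF contribution converges differently and is normally extrapolated, or simply taken from the largest basis set, separately.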

It has been suggested that a separate extrapolation of the singlet (opposite spin) and triplet (same spin) correlation energies with $A + B(L + 1/2)^{-3}$ and $A + B(L + 1/2)^{-5}$ functional forms, respectively, may provide better results.37

The main difficulty in using the cc-pVxZ or pc-n basis sets is that each step up in quality roughly doubles the number of basis functions. The fitting functions in eqs (5.10) and (5.11) contain three parameters, and therefore require at least three calculations with increasingly larger basis sets. The simplest sequence is cc-pVDZ, cc-pVTZ and cc-pVQZ, but the cc-pVDZ basis is too small to give good extrapolated values for the correlation energy, and a better sequence is cc-pVTZ, cc-pVQZ and cc-pV5Z. The requirement of performing calculations with at least the cc-pVQZ basis places severe constraints on the size of the systems that can be treated. The extrapolation based on eq. (5.12) has the advantage of requiring only two reference calculations. It should be noted that the B parameter in eq. (5.11) varies little from system to system, and taking this to be a universal constant also reduces eq. (5.11) to a two-parameter fitting function.
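The growth in basis set size can be made concrete by counting functions per first row atom from the contracted cc-pVxZ compositions. The sketch assumes spherical harmonic functions, i.e. 2l + 1 components per shell; the helper name `count_functions` is hypothetical.

```python
import re

# Angular momentum quantum number for each shell label
ANGULAR_MOMENTUM = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4, "h": 5, "i": 6}


def count_functions(composition):
    """Number of basis functions in e.g. '4s3p2d1f', assuming 2l+1 per shell."""
    total = 0
    for count, label in re.findall(r"(\d+)([spdfghi])", composition):
        total += int(count) * (2 * ANGULAR_MOMENTUM[label] + 1)
    return total


# Contracted compositions for first row elements
first_row = {
    "cc-pVDZ": "3s2p1d",
    "cc-pVTZ": "4s3p2d1f",
    "cc-pVQZ": "5s4p3d2f1g",
    "cc-pV5Z": "6s5p4d3f2g1h",
    "cc-pV6Z": "7s6p5d4f3g2h1i",
}
sizes = {name: count_functions(comp) for name, comp in first_row.items()}

previous = None
for name, size in sizes.items():
    ratio = "" if previous is None else f"  (x{size / previous:.2f})"
    print(f"{name}: {size} functions{ratio}")
    previous = size
```

The counts are 14, 30, 55, 91 and 140 functions per atom, so the growth factor falls from slightly above two at the DZ to TZ step to about 1.5 at the 5Z to 6Z step.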

Perhaps the most interesting aspect of the analyses that led to the development of the correlation consistent basis sets is the fact that high angular momentum functions are necessary for achieving high accuracy. While d-polarization functions are sufficient for a DZ type basis, a TZ type should also include f-functions. Similarly, it is questionable to use a QZ type basis for the sp-functions without also including three d-, two f- and one g-function in order to systematically reduce the errors. It can therefore be argued that an extension of for example the 6-31G(d,p) to 6-311G(d,p) is inconsistent as the second set of d-orbitals (and second set of p-orbitals for hydrogen) and a set of f-functions (d-functions for hydrogen) will give similar contributions as the extra set of sp-functions. Similarly, the extension of the 6-311G(2df,2pd) basis to 6-311G(3df,3pd) may be considered inconsistent, as the third d-function is expected to be as important as the fourth valence set of sp-functions, the second set of f-functions and the first set of g-functions, all of which are neglected.

In the search for a basis set converged value, other approximations should be kept in mind. Basis sets with many high angular momentum functions are normally designed for recovering a large fraction of the correlation energy. In the majority of cases, only the electron correlation of the valence electrons is considered (frozen-core approximation), since the core orbitals usually are insensitive to the molecular environment. As the valence space approaches completeness in terms of basis functions, the error from the frozen-core approximation will at some point become comparable to the remaining valence error. From studies of small molecules, where good experimental data are available, it is suggested that the effect of core electron correlation for unproblematic systems is comparable with the change observed upon enlarging the cc-pV5Z basis, i.e. of a similar magnitude as the introduction of h-functions.38 Improvements beyond the cc-pV6Z basis set have been argued to produce changes of similar magnitude to those expected from relativistic corrections for first row elements, and further increases to cc-pV7Z and cc-pV8Z type basis sets would be comparable with corrections due to breakdown of the Born-Oppenheimer approximation for systems with hydrogen. Within the non-relativistic realm, it would therefore appear that basis sets larger than cc-pV6Z would be of little use, except for extrapolating to the non-relativistic, clamped nuclei limit for testing purposes. In attempts at obtaining results of "spectroscopic accuracy" (~0.01 kJ/mol), a brute force calculation with for example the cc-pV7Z quality basis set combined with explicit extrapolation has been shown to become problematic,37 and such high-quality results must probably be obtained by explicitly correlated techniques, such as the R12 method discussed in Section 4.11.

There is a practical aspect of using large basis sets, especially those including diffuse functions, that requires special attention, namely the problem of linear dependence. Linear dependence means that one (or more) of the basis functions can be written as a linear combination of the others, i.e. the basis set is overcomplete. A diffuse function has a small exponent and consequently extends far away from the nucleus on which it is located. An equally diffuse function located on a nearby atom will therefore span almost the same space. A measure of the degree of linear dependence in a basis set can be obtained from the eigenvalues of the overlap matrix S (eq. (3.51)). A truly linearly dependent basis will have at least one eigenvalue of exactly zero, and the smallest eigenvalue of the S matrix is therefore an indication of how close the actual basis set is to linear dependence. As described in Section 16.2.3, solution of the SCF equations requires orthogonalization of the basis by means of the $S^{-1/2}$ matrix (or a related matrix that makes the basis orthogonal). If one of the S matrix eigenvalues is close to zero, the $S^{-1/2}$ matrix is essentially singular, which in turn will cause numerical problems when carrying out an actual calculation. In practice, there is therefore an upper limit on how close to completeness a basis set can be chosen to be, and this limit is determined by the finite precision with which the calculations are carried out. If the selected basis set turns out to be too close to linear dependence to be handled, the linear combinations of basis functions with low eigenvalues in the S matrix may be discarded.
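The diagnosis and the remedy can be sketched with NumPy. The example builds the overlap matrix for a hypothetical basis of one tight and one very diffuse normalized s-type Gaussian on each of two centres, inspects the eigenvalues of S, and forms an orthogonalizing matrix from the retained eigenvectors only (often called canonical orthogonalization). The exponents, the distance and the threshold are illustrative choices, not recommended values.

```python
import numpy as np


def s_overlap(a, b, r):
    """Overlap of two normalized s-type Gaussians with exponents a and b
    on centres separated by the distance r."""
    p = a + b
    return (2.0 * np.sqrt(a * b) / p) ** 1.5 * np.exp(-a * b / p * r * r)


def canonical_orthogonalizer(S, threshold):
    """Return X with X.T @ S @ X = 1, discarding eigenvectors of S whose
    eigenvalues fall below the threshold, together with all eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    keep = eigvals > threshold
    X = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return X, eigvals


# Hypothetical basis: tight (1.0) and diffuse (0.001) s-functions on two
# centres placed 1.0 length unit apart along a line
exponents = np.array([1.0, 0.001, 1.0, 0.001])
positions = np.array([0.0, 0.0, 1.0, 1.0])
n = len(exponents)
S = np.array([[s_overlap(exponents[i], exponents[j],
                         abs(positions[i] - positions[j]))
               for j in range(n)] for i in range(n)])

X, eigvals = canonical_orthogonalizer(S, threshold=1e-3)
print("eigenvalues of S:", eigvals)
print("functions kept:", X.shape[1], "of", n)
```

The two diffuse functions overlap almost completely, which shows up as one eigenvalue of S close to zero. Dropping the corresponding linear combination leaves three orthonormal functions, and the transformed overlap `X.T @ S @ X` is the 3 x 3 unit matrix.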
