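22 Text Processing

22.1 String Objects

22.1.1 The String Constructor

The String constructor:

is %String%.
is the initial value of the "String" property of the global object.
creates and initializes a new String object when called as a constructor.
performs a type conversion when called as a function rather than as a constructor.
may be used as the value of an extends clause of a class definition. Subclass constructors that intend to inherit the specified String behaviour must include a super call to the String constructor to create and initialize the subclass instance with a [[StringData]] internal slot.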

22.1.1.1 String ( `value` )

This function performs the following steps when called:

If value is not present, then
1. Let s be the empty String.
Else,
1. If NewTarget is undefined and value is a Symbol, return SymbolDescriptiveString(value).
2. Let s be ? ToString(value).
If NewTarget is undefined, return s.
Return StringCreate(s, ? GetPrototypeFromConstructor(NewTarget, "%String.prototype%")).
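
The steps above distinguish a plain call from a constructor call. The following is a minimal sketch of the observable difference; the variable names are illustrative only and are not part of the specification.

```javascript
// Function call: type conversion, the result is a primitive String value.
const asPrimitive = String(42);

// Constructor call: the result is a String wrapper object with [[StringData]].
const asObject = new String(42);

// With no argument, s is the empty String.
const empty = String();

// A Symbol is converted only in the function-call form...
const described = String(Symbol("id"));

// ...while the constructor form reaches ToString(value), which throws for Symbols.
let constructorThrew = false;
try {
  new String(Symbol("id"));
} catch (err) {
  constructorThrew = err instanceof TypeError;
}

console.log(typeof asPrimitive, typeof asObject, described, constructorThrew);
// → string object Symbol(id) true
```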

22.1.2 Properties of the String Constructor

The String constructor:

has a [[Prototype]] internal slot whose value is %Function.prototype%.
has the following properties:

22.1.2.1 String.fromCharCode ( ...`codeUnits` )

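This function may be called with any number of arguments, which form the rest parameter codeUnits.

It performs the following steps when called:

Let result be the empty String.
For each element next of codeUnits, do
1. Let nextCU be the code unit whose numeric value is ℝ(? ToUint16(next)).
2. Set result to the string-concatenation of result and nextCU.
Return result.

The "length" property of this function is 1_𝔽.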

22.1.2.2 String.fromCodePoint ( ...`codePoints` )

This function may be called with any number of arguments, which form the rest parameter codePoints.

It performs the following steps when called:

Let result be the empty String.
For each element next of codePoints, do
1. Let nextCP be ? ToNumber(next).
2. If nextCP is not an integral Number, throw a RangeError exception.
3. If ℝ(nextCP) < 0 or ℝ(nextCP) > 0x10FFFF, throw a RangeError exception.
4. Set result to the string-concatenation of result and UTF16EncodeCodePoint(ℝ(nextCP)).
Assert: If codePoints is empty, then result is the empty String.
Return result.

The "length" property of this function is 1_𝔽.
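
The two factory functions treat out-of-range input differently: String.fromCharCode wraps through ToUint16, while String.fromCodePoint throws. A short sketch of that contrast follows; the helper rejects is hypothetical and exists only for this illustration.

```javascript
// fromCharCode applies ToUint16, so values wrap modulo 2^16.
const wrapped = String.fromCharCode(0x10041); // same as 0x0041, "A"

// fromCodePoint encodes a supplementary code point as a surrogate pair.
const face = String.fromCodePoint(0x1f600);
const units = [face.charCodeAt(0), face.charCodeAt(1)]; // [0xD83D, 0xDE00]

// Hypothetical helper: reports whether fromCodePoint rejects the input with a RangeError.
function rejects(value) {
  try {
    String.fromCodePoint(value);
    return false;
  } catch (err) {
    return err instanceof RangeError;
  }
}

console.log(wrapped, face.length, rejects(0x110000), rejects(1.5), rejects(-1));
// → A 2 true true true
```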

22.1.2.3 String.prototype

The initial value of String.prototype is the String prototype object.

This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.

22.1.2.4 String.raw ( `template`, ...`substitutions` )

This function may be called with a variable number of arguments. The first argument is template and the remaining arguments form the List substitutions.

It performs the following steps when called:

Let substitutionCount be the number of elements in substitutions.
Let cooked be ? ToObject(template).
Let literals be ? ToObject(? Get(cooked, "raw")).
Let literalCount be ? LengthOfArrayLike(literals).
If literalCount ≤ 0, return the empty String.
Let R be the empty String.
Let nextIndex be 0.
Repeat,
1. Let nextLiteralVal be ? Get(literals, ! ToString(𝔽(nextIndex))).
2. Let nextLiteral be ? ToString(nextLiteralVal).
3. Set R to the string-concatenation of R and nextLiteral.
4. If nextIndex + 1 = literalCount, return R.
5. If nextIndex < substitutionCount, then
  1. Let nextSubVal be substitutions[nextIndex].
  2. Let nextSub be ? ToString(nextSubVal).
  3. Set R to the string-concatenation of R and nextSub.
6. Set nextIndex to nextIndex + 1.

Note

This function is intended for use as a tag function of a Tagged Template (13.3.11). When called as such, the first argument will be a well formed template object and the rest parameter will contain the substitution values.
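
As a rough illustration of the algorithm, String.raw can be used both as a tag and as an ordinary function given any object with a "raw" property. The objects passed below are made up for the example.

```javascript
// As a tag function: the raw literal keeps the backslash and the "n" as two characters.
const tagged = String.raw`a\n${1 + 1}b`;

// Called directly with a hand-built object that has a "raw" property.
// Substitutions beyond literalCount - 1 are ignored.
const direct = String.raw({ raw: ["x", "y", "z"] }, 0, 1, 2);

// With no literals, the result is the empty String.
const none = String.raw({ raw: [] }, "ignored");

console.log(tagged.length, direct, JSON.stringify(none));
// → 5 x0y1z ""
```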

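22.1.3 Properties of the String Prototype Object

The String prototype object:

is %String.prototype%.
is a String exotic object and has the internal methods specified for such objects.
has a [[StringData]] internal slot whose value is the empty String.
has a "length" property whose initial value is +0_𝔽 and whose attributes are { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.
has a [[Prototype]] internal slot whose value is %Object.prototype%.

Unless explicitly stated otherwise, the methods of the String prototype object defined below are not generic and the this value passed to them must be either a String value or an object that has a [[StringData]] internal slot that has been initialized to a String value.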

22.1.3.1 String.prototype.at ( `index` )

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let len be the length of S.
Let relativeIndex be ? ToIntegerOrInfinity(index).
If relativeIndex ≥ 0, then
1. Let k be relativeIndex.
Else,
1. Let k be len + relativeIndex.
If k < 0 or k ≥ len, return undefined.
Return the substring of S from k to k + 1.

22.1.3.2 String.prototype.charAt ( `pos` )

Note 1

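This method returns a single element String containing the code unit at index pos within the String value resulting from converting this object to a String. If there is no element at that index, the result is the empty String. The result is a String value, not a String object.

If pos is an integral Number, then the result of x.charAt(pos) is equivalent to the result of x.substring(pos, pos + 1).

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let position be ? ToIntegerOrInfinity(pos).
Let size be the length of S.
If position < 0 or position ≥ size, return the empty String.
Return the substring of S from position to position + 1.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.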

22.1.3.3 String.prototype.charCodeAt ( `pos` )

Note 1

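This method returns a Number (a non-negative integral Number less than 2¹⁶) that is the numeric value of the code unit at index pos within the String resulting from converting this object to a String. If there is no element at that index, the result is NaN.

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let position be ? ToIntegerOrInfinity(pos).
Let size be the length of S.
If position < 0 or position ≥ size, return NaN.
Return the Number value for the numeric value of the code unit at index position within S.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.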

22.1.3.4 String.prototype.codePointAt ( `pos` )

Note 1

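This method returns a non-negative integral Number less than or equal to 0x10FFFF_𝔽 that is the numeric value of the UTF-16 encoded code point (6.1.4) starting at the string element at index pos within the String resulting from converting this object to a String. If there is no element at that index, the result is undefined. If a valid UTF-16 surrogate pair does not begin at pos, the result is the code unit at pos.

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let position be ? ToIntegerOrInfinity(pos).
Let size be the length of S.
If position < 0 or position ≥ size, return undefined.
Let cp be CodePointAt(S, position).
Return 𝔽(cp.[[CodePoint]]).

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.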
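
The four index-based accessors described so far (at, charAt, charCodeAt, codePointAt) differ mainly in how they report a missing element and in whether they decode surrogate pairs. The following sketch compares them on one sample string chosen for this illustration.

```javascript
const s = "a\u{1F600}"; // "a" followed by one surrogate pair, so s.length is 3

// Out-of-range index: each method signals "no element" differently.
const missing = [s.at(5), s.charAt(5), s.charCodeAt(5), s.codePointAt(5)];
// → [undefined, "", NaN, undefined]

// Only at accepts a negative index relative to the end.
const lastUnit = s.at(-1); // the trailing surrogate "\uDE00"

// charCodeAt reads one code unit; codePointAt decodes a pair that starts at pos.
const unit = s.charCodeAt(1);       // 0xD83D
const point = s.codePointAt(1);     // 0x1F600
const trailOnly = s.codePointAt(2); // 0xDE00, no pair starts here

console.log(unit.toString(16), point.toString(16), trailOnly.toString(16));
// → d83d 1f600 de00
```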

22.1.3.5 String.prototype.concat ( ...`args` )

Note 1

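When this method is called it returns the String value consisting of the code units of the this value (converted to a String) followed by the code units of each of the arguments converted to a String. The result is a String value, not a String object.

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let R be S.
For each element next of args, do
1. Let nextString be ? ToString(next).
2. Set R to the string-concatenation of R and nextString.
Return R.

The "length" property of this method is 1_𝔽.

Note 2

This method is intentionally generic; it does not require that its this value be a String object. Therefore it can be transferred to other kinds of objects for use as a method.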

22.1.3.6 String.prototype.constructor

The initial value of String.prototype.constructor is %String%.

22.1.3.7 String.prototype.endsWith ( `searchString` [ , `endPosition` ] )

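This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let isRegExp be ? IsRegExp(searchString).
If isRegExp is true, throw a TypeError exception.
Let searchStr be ? ToString(searchString).
Let len be the length of S.
If endPosition is undefined, let pos be len; else let pos be ? ToIntegerOrInfinity(endPosition).
Let end be the result of clamping pos between 0 and len.
Let searchLength be the length of searchStr.
If searchLength = 0, return true.
Let start be end - searchLength.
If start < 0, return false.
Let substring be the substring of S from start to end.
If substring is searchStr, return true.
Return false.

Note 1

This method returns true if the sequence of code units of searchString converted to a String is the same as the corresponding code units of this object (converted to a String) starting at endPosition - length(this). Otherwise it returns false.

Note 2

Throwing an exception if the first argument is a RegExp is specified in order to allow future editions to define extensions that allow such argument values.

Note 3

This method is intentionally generic; it does not require that its this value be a String object. Therefore, it can be transferred to other kinds of objects for use as a method.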

22.1.3.8 String.prototype.includes ( `searchString` [ , `position` ] )

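This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let isRegExp be ? IsRegExp(searchString).
If isRegExp is true, throw a TypeError exception.
Let searchStr be ? ToString(searchString).
Let pos be ? ToIntegerOrInfinity(position).
Assert: If position is undefined, then pos is 0.
Let len be the length of S.
Let start be the result of clamping pos between 0 and len.
Let index be StringIndexOf(S, searchStr, start).
If index is not-found, return false.
Return true.

Note 1

If searchString appears as a substring of the result of converting this object to a String, at one or more indices that are greater than or equal to position, this method returns true; otherwise, it returns false. If position is undefined, 0 is assumed, so as to search all of the String.

Note 2

Throwing an exception if the first argument is a RegExp is specified in order to allow for extensions in future editions.

Note 3

This method is intentionally generic; it does not require that its this value be a String object.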

22.1.3.9 String.prototype.indexOf ( `searchString` [ , `position` ] )

Note 1

If searchString appears as a substring of the result of converting this object to a String, at one or more indices that are greater than or equal to position, then the smallest such index is returned; otherwise, -1_𝔽 is returned. If position is undefined, +0_𝔽 is assumed, so as to search all of the String.

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let searchStr be ? ToString(searchString).
Let pos be ? ToIntegerOrInfinity(position).
Assert: If position is undefined, then pos is 0.
Let len be the length of S.
Let start be the result of clamping pos between 0 and len.
Let result be StringIndexOf(S, searchStr, start).
If result is not-found, return -1_𝔽.
Return 𝔽(result).

Note 2

This method is intentionally generic.

22.1.3.10 String.prototype.isWellFormed ( )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Return IsStringWellFormedUnicode(S).

22.1.3.11 String.prototype.lastIndexOf ( `searchString` [ , `position` ] )

Note 1

If searchString appears as a substring of the result of converting this object to a String at one or more indices that are smaller than or equal to position, then the greatest such index is returned; otherwise, -1_𝔽 is returned. If position is undefined, the length of the String value is assumed, so as to search all of the String.

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let searchStr be ? ToString(searchString).
Let numPos be ? ToNumber(position).
Assert: If position is undefined, then numPos is NaN.
If numPos is NaN, let pos be +∞; otherwise, let pos be ! ToIntegerOrInfinity(numPos).
Let len be the length of S.
Let searchLen be the length of searchStr.
If len < searchLen, return -1_𝔽.
Let start be the result of clamping pos between 0 and len - searchLen.
Let result be StringLastIndexOf(S, searchStr, start).
If result is not-found, return -1_𝔽.
Return 𝔽(result).

Note 2

This method is intentionally generic.
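
One consequence of the steps above is that a NaN position means "search from the end" for lastIndexOf but "search from the start" for indexOf. A small sketch of that asymmetry, using an arbitrary sample string:

```javascript
const s = "canal"; // "a" occurs at indices 1 and 3

const fromEnd = s.lastIndexOf("a");      // 3
const upTo2 = s.lastIndexOf("a", 2);     // 1
// NaN position: lastIndexOf treats it as +∞, indexOf treats it as 0.
const nanLast = s.lastIndexOf("a", NaN); // 3
const nanFirst = s.indexOf("a", NaN);    // 1
// A negative position is clamped to 0, so only index 0 is examined.
const clamped = s.lastIndexOf("a", -5);  // -1
// The empty search string matches at the clamped start position.
const emptySearch = s.lastIndexOf("");   // 5

console.log(fromEnd, upTo2, nanLast, nanFirst, clamped, emptySearch);
// → 3 1 3 1 -1 5
```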

22.1.3.12 String.prototype.localeCompare ( `that` [ , `reserved1` [ , `reserved2` ] ] )

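An ECMAScript implementation that includes the ECMA-402 Internationalization API must implement this method as specified in the ECMA-402 specification. If an ECMAScript implementation does not include the ECMA-402 API, the following specification of this method is used:

This method returns a Number other than NaN representing the result of an implementation-defined locale-sensitive String comparison of the this value (converted to a String S) with that (converted to a String thatValue). The result is intended to correspond with a sort order of String values according to the conventions of the host environment's current locale, and will be negative when S is ordered before thatValue, positive when S is ordered after thatValue, and zero in all other cases (representing no relative ordering between S and thatValue).

Before performing the comparisons, this method performs the following steps to prepare the Strings:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let thatValue be ? ToString(that).

The meaning of the optional second and third parameters to this method are defined in the ECMA-402 specification; implementations that do not include ECMA-402 support must not assign any other interpretation to those parameter positions.

The actual return values are implementation-defined to permit encoding additional information in them, but this method, when considered as a method of two arguments, is required to be a consistent comparator defining a total ordering on the set of all Strings. This method is also required to recognize and honour canonical equivalence according to the Unicode Standard, including returning +0_𝔽 when comparing distinguishable Strings that are canonically equivalent.

Note 1

This method itself is not directly suitable as an argument to Array.prototype.sort because the latter requires a function of two arguments.

Note 2

This method may rely on whatever language- and/or locale-sensitive comparison functionality is available to the ECMAScript environment from the host environment, and is intended to compare according to the conventions of the host environment's current locale. However, regardless of comparison capabilities, this method must recognize and honour canonical equivalence according to the Unicode Standard. For example, the following comparisons must all return +0_𝔽: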

// Å ANGSTROM SIGN vs.
// Å LATIN CAPITAL LETTER A + COMBINING RING ABOVE
"\u212B".localeCompare("A\u030A")

// Ω OHM SIGN vs.
// Ω GREEK CAPITAL LETTER OMEGA
"\u2126".localeCompare("\u03A9")

// ṩ LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE vs.
// ṩ LATIN SMALL LETTER S + COMBINING DOT ABOVE + COMBINING DOT BELOW
"\u1E69".localeCompare("s\u0307\u0323")

// ḍ̇ LATIN SMALL LETTER D WITH DOT ABOVE + COMBINING DOT BELOW vs.
// ḍ̇ LATIN SMALL LETTER D WITH DOT BELOW + COMBINING DOT ABOVE
"\u1E0B\u0323".localeCompare("\u1E0D\u0307")

// 가 HANGUL CHOSEONG KIYEOK + HANGUL JUNGSEONG A vs.
// 가 HANGUL SYLLABLE GA
"\u1100\u1161".localeCompare("\uAC00")

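For a definition and discussion of canonical equivalence see the Unicode Standard, chapters 2 and 3, as well as UAX #15, Unicode Technical Note #5, and UTS #10.

It is recommended that this method should not honour Unicode compatibility equivalents or compatibility decompositions.

Note 3

This method is intentionally generic.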

22.1.3.13 String.prototype.match ( `regexp` )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If regexp is an Object, then
1. Let matcher be ? GetMethod(regexp, %Symbol.match%).
2. If matcher is not undefined, then
  1. Return ? Call(matcher, regexp, « O »).
Let S be ? ToString(O).
Let rx be ? RegExpCreate(regexp, undefined).
Return ? Invoke(rx, %Symbol.match%, « S »).

Note

This method is intentionally generic.

22.1.3.14 String.prototype.matchAll ( `regexp` )

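This method performs a regular expression match of the String representing the this value against regexp and returns an iterator that yields match results. Each match result is an Array containing the matched portion of the String as the first element, followed by the portions matched by any capturing groups. If the regular expression never matches, the returned iterator does not yield any match results.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If regexp is an Object, then
1. Let isRegExp be ? IsRegExp(regexp).
2. If isRegExp is true, then
  1. Let flags be ? Get(regexp, "flags").
  2. Perform ? RequireObjectCoercible(flags).
  3. If ? ToString(flags) does not contain "g", throw a TypeError exception.
3. Let matcher be ? GetMethod(regexp, %Symbol.matchAll%).
4. If matcher is not undefined, then
  1. Return ? Call(matcher, regexp, « O »).
Let S be ? ToString(O).
Let rx be ? RegExpCreate(regexp, "g").
Return ? Invoke(rx, %Symbol.matchAll%, « S »).

Note 1

This method is intentionally generic; it does not require that its this value be a String object.

Note 2

Similarly to String.prototype.split, String.prototype.matchAll is designed to typically act without mutating its inputs.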
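
The following sketch exercises the three paths through the matchAll steps: a global RegExp, a non-RegExp argument that is compiled with the "g" flag, and a RegExp without "g" that is rejected. The sample strings are arbitrary.

```javascript
// Each result is an array: the full match first, then the capture groups.
const results = [..."a1b2".matchAll(/[a-z](\d)/g)];
const digits = results.map((m) => m[1]); // ["1", "2"]

// A non-RegExp argument is compiled via RegExpCreate(regexp, "g").
const fromString = [..."a1b2".matchAll("\\d")].map((m) => m[0]); // ["1", "2"]

// With no match at all, the iterator yields nothing.
const none = [..."abc".matchAll(/x/g)];

// A RegExp without the "g" flag is rejected.
let threw = false;
try {
  "a1".matchAll(/a/);
} catch (err) {
  threw = err instanceof TypeError;
}

console.log(digits.join(","), fromString.join(","), none.length, threw);
// → 1,2 1,2 0 true
```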

22.1.3.15 String.prototype.normalize ( [ `form` ] )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
If form is undefined, let f be "NFC".
Else, let f be ? ToString(form).
If f is not one of "NFC", "NFD", "NFKC", or "NFKD", throw a RangeError exception.
Let ns be the String value that is the result of normalizing S into the normalization form named by f as specified in the latest Unicode Standard, Normalization Forms.
Return ns.

Note

This method is intentionally generic.
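
A brief sketch of the four accepted form names in use, assuming a runtime with Unicode normalization data available (as is typical for current JavaScript engines). The sample strings are illustrative.

```javascript
const composed = "\u00C5";    // Å as a single code point
const decomposed = "A\u030A"; // A followed by COMBINING RING ABOVE

const sameBefore = composed === decomposed;             // false
const nfd = composed.normalize("NFD") === decomposed;   // true
const nfcDefault = decomposed.normalize() === composed; // true, form defaults to "NFC"
const compat = "\uFB01".normalize("NFKC");              // "fi" from the fi ligature

// Form names are compared exactly, so lowercase "nfc" is rejected.
let threw = false;
try {
  "x".normalize("nfc");
} catch (err) {
  threw = err instanceof RangeError;
}

console.log(sameBefore, nfd, nfcDefault, compat, threw);
// → false true true fi true
```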

22.1.3.16 String.prototype.padEnd ( `maxLength` [ , `fillString` ] )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Return ? StringPaddingBuiltinsImpl(O, maxLength, fillString, end).

22.1.3.17 String.prototype.padStart ( `maxLength` [ , `fillString` ] )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Return ? StringPaddingBuiltinsImpl(O, maxLength, fillString, start).

22.1.3.17.1 StringPaddingBuiltinsImpl ( `O`, `maxLength`, `fillString`, `placement` )

The abstract operation StringPaddingBuiltinsImpl takes arguments O (an ECMAScript language value), maxLength (an ECMAScript language value), fillString (an ECMAScript language value), and placement (start or end) and returns either a normal completion containing a String or a throw completion. It performs the following steps when called:

Let S be ? ToString(O).
Let intMaxLength be ℝ(? ToLength(maxLength)).
Let stringLength be the length of S.
If intMaxLength ≤ stringLength, return S.
If fillString is undefined, set fillString to the String value consisting solely of the code unit 0x0020 (SPACE).
Else, set fillString to ? ToString(fillString).
Return StringPad(S, intMaxLength, fillString, placement).

22.1.3.17.2 StringPad ( `S`, `maxLength`, `fillString`, `placement` )

The abstract operation StringPad takes arguments S (a String), maxLength (a non-negative integer), fillString (a String), and placement (start or end) and returns a String. It performs the following steps when called:

Let stringLength be the length of S.
If maxLength ≤ stringLength, return S.
If fillString is the empty String, return S.
Let fillLen be maxLength - stringLength.
Let truncatedStringFiller be the String value consisting of repeated concatenations of fillString truncated to length fillLen.
If placement is start, return the string-concatenation of truncatedStringFiller and S.
Else, return the string-concatenation of S and truncatedStringFiller.

Note 1

The argument maxLength will be clamped such that it can be no smaller than the length of S.

Note 2

The argument fillString defaults to " " (the String value consisting of the code unit 0x0020 SPACE).
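
The StringPad steps can be sketched in a few lines of JavaScript. The function stringPad below is a hypothetical helper written for this illustration, not an API of the language; it assumes its arguments already have the types the abstract operation requires.

```javascript
// Hypothetical helper mirroring the StringPad steps (placement is "start" or "end").
function stringPad(S, maxLength, fillString, placement) {
  const stringLength = S.length;
  if (maxLength <= stringLength) return S;
  if (fillString === "") return S;
  const fillLen = maxLength - stringLength;
  // Repeat fillString enough times to cover fillLen, then truncate.
  const repeats = Math.ceil(fillLen / fillString.length);
  const truncatedStringFiller = fillString.repeat(repeats).slice(0, fillLen);
  return placement === "start" ? truncatedStringFiller + S : S + truncatedStringFiller;
}

console.log(stringPad("abc", 7, "12", "start"), stringPad("abc", 6, "xy", "end"));
// → 1212abc abcxyx
```

Under these assumptions the helper should agree with the built-in padStart and padEnd for String inputs.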

22.1.3.17.3 ToZeroPaddedDecimalString ( `n`, `minLength` )

The abstract operation ToZeroPaddedDecimalString takes arguments n (a non-negative integer) and minLength (a non-negative integer) and returns a String. It performs the following steps when called:

Let S be the String representation of n, formatted as a decimal number.
Return StringPad(S, minLength, "0", start).

22.1.3.18 String.prototype.repeat ( `count` )

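This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let n be ? ToIntegerOrInfinity(count).
If n < 0 or n = +∞, throw a RangeError exception.
If n = 0, return the empty String.
Return the String value that is made from n copies of S appended together.

Note 1

This method creates the String value consisting of the code units of the this value (converted to a String) repeated count times.

Note 2

This method is intentionally generic.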

22.1.3.19 String.prototype.replace ( `searchValue`, `replaceValue` )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If searchValue is an Object, then
1. Let replacer be ? GetMethod(searchValue, %Symbol.replace%).
2. If replacer is not undefined, then
  1. Return ? Call(replacer, searchValue, « O, replaceValue »).
Let string be ? ToString(O).
Let searchString be ? ToString(searchValue).
Let functionalReplace be IsCallable(replaceValue).
If functionalReplace is false, then
1. Set replaceValue to ? ToString(replaceValue).
Let searchLength be the length of searchString.
Let position be StringIndexOf(string, searchString, 0).
If position is not-found, return string.
Let preceding be the substring of string from 0 to position.
Let following be the substring of string from position + searchLength.
If functionalReplace is true, then
1. Let replacement be ? ToString(? Call(replaceValue, undefined, « searchString, 𝔽(position), string »)).
Else,
1. Assert: replaceValue is a String.
2. Let captures be a new empty List.
3. Let replacement be ! GetSubstitution(searchString, string, position, captures, undefined, replaceValue).
Return the string-concatenation of preceding, replacement, and following.

Note

This method is intentionally generic.

22.1.3.19.1 GetSubstitution ( `matched`, `str`, `position`, `captures`, `namedCaptures`, `replacementTemplate` )

The abstract operation GetSubstitution takes arguments matched (a String), str (a String), position (a non-negative integer), captures (a List of either Strings or undefined), namedCaptures (an Object or undefined), and replacementTemplate (a String) and returns either a normal completion containing a String or a throw completion. For the purposes of this abstract operation, a decimal digit is a code unit in the inclusive interval from 0x0030 (DIGIT ZERO) to 0x0039 (DIGIT NINE). It performs the following steps when called:

Let stringLength be the length of str.
Assert: position ≤ stringLength.
Let result be the empty String.
Let templateRemainder be replacementTemplate.
Repeat, while templateRemainder is not the empty String,
1. NOTE: The following steps isolate ref (a prefix of templateRemainder), determine refReplacement (its replacement), and then append that replacement to result.
2. If templateRemainder starts with "$$", then
  1. Let ref be "$$".
  2. Let refReplacement be "$".
3. Else if templateRemainder starts with "$`", then
  1. Let ref be "$`".
  2. Let refReplacement be the substring of str from 0 to position.
4. Else if templateRemainder starts with "$&", then
  1. Let ref be "$&".
  2. Let refReplacement be matched.
5. Else if templateRemainder starts with "$'" (0x0024 (DOLLAR SIGN) followed by 0x0027 (APOSTROPHE)), then
  1. Let ref be "$'".
  2. Let matchLength be the length of matched.
  3. Let tailPos be position + matchLength.
  4. Let refReplacement be the substring of str from min(tailPos, stringLength).
  5. NOTE: tailPos can exceed stringLength only if this abstract operation was invoked by a call to the intrinsic %Symbol.replace% method of %RegExp.prototype% on an object whose "exec" property is not the intrinsic %RegExp.prototype.exec%.
6. Else if templateRemainder starts with "$" followed by 1 or more decimal digits, then
  1. If templateRemainder starts with "$" followed by 2 or more decimal digits, let digitCount be 2. Otherwise, let digitCount be 1.
  2. Let digits be the substring of templateRemainder from 1 to 1 + digitCount.
  3. Let index be ℝ(StringToNumber(digits)).
  4. Assert: 0 ≤ index ≤ 99.
  5. Let captureLen be the number of elements in captures.
  6. If index > captureLen and digitCount = 2, then
    1. NOTE: When a two-digit replacement pattern specifies an index exceeding the count of capturing groups, it is treated as a one-digit replacement pattern followed by a literal digit.
    2. Set digitCount to 1.
    3. Set digits to the substring of digits from 0 to 1.
    4. Set index to ℝ(StringToNumber(digits)).
  7. Let ref be the substring of templateRemainder from 0 to 1 + digitCount.
  8. If 1 ≤ index ≤ captureLen, then
    1. Let capture be captures[index - 1].
    2. If capture is undefined, then
      1. Let refReplacement be the empty String.
    3. Else,
      1. Let refReplacement be capture.
  9. Else,
    1. Let refReplacement be ref.
7. Else if templateRemainder starts with "$<", then
  1. Let gtPos be StringIndexOf(templateRemainder, ">", 0).
  2. If gtPos is not-found or namedCaptures is undefined, then
    1. Let ref be "$<".
    2. Let refReplacement be ref.
  3. Else,
    1. Let ref be the substring of templateRemainder from 0 to gtPos + 1.
    2. Let groupName be the substring of templateRemainder from 2 to gtPos.
    3. Assert: namedCaptures is an Object.
    4. Let capture be ? Get(namedCaptures, groupName).
    5. If capture is undefined, then
      1. Let refReplacement be the empty String.
    6. Else,
      1. Let refReplacement be ? ToString(capture).
8. Else,
  1. Let ref be the substring of templateRemainder from 0 to 1.
  2. Let refReplacement be ref.
9. Let refLength be the length of ref.
10. Set templateRemainder to the substring of templateRemainder from refLength.
11. Set result to the string-concatenation of result and refReplacement.
Return result.
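
The replacement patterns handled by GetSubstitution can be observed through String.prototype.replace. The following sketch walks through each branch with small sample inputs chosen for this illustration.

```javascript
const s = "abc";
const dollar = s.replace("b", "$$");    // "a$c"
const matched = s.replace("b", "[$&]"); // "a[b]c"
const before = s.replace("b", "$`");    // "aac": the text preceding the match
const after = s.replace("b", "$'");     // "acc": the text following the match
// A string search supplies no captures, so "$1" is copied literally.
const noCapture = s.replace("b", "$1"); // "a$1c"

const numbered = "2024-05".replace(/(\d+)-(\d+)/, "$2/$1");          // "05/2024"
const named = "2024-05".replace(/(?<y>\d+)-(?<m>\d+)/, "$<m>/$<y>"); // "05/2024"
// "$10" with a single capture falls back to "$1" followed by a literal "0".
const fallback = "ab".replace(/(a)/, "$10");                         // "a0b"

console.log(dollar, matched, before, after, noCapture, numbered, named, fallback);
// → a$c a[b]c aac acc a$1c 05/2024 05/2024 a0b
```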

22.1.3.20 String.prototype.replaceAll ( `searchValue`, `replaceValue` )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If searchValue is an Object, then
1. Let isRegExp be ? IsRegExp(searchValue).
2. If isRegExp is true, then
  1. Let flags be ? Get(searchValue, "flags").
  2. Perform ? RequireObjectCoercible(flags).
  3. If ? ToString(flags) does not contain "g", throw a TypeError exception.
3. Let replacer be ? GetMethod(searchValue, %Symbol.replace%).
4. If replacer is not undefined, then
  1. Return ? Call(replacer, searchValue, « O, replaceValue »).
Let string be ? ToString(O).
Let searchString be ? ToString(searchValue).
Let functionalReplace be IsCallable(replaceValue).
If functionalReplace is false, then
1. Set replaceValue to ? ToString(replaceValue).
Let searchLength be the length of searchString.
Let advanceBy be max(1, searchLength).
Let matchPositions be a new empty List.
Let position be StringIndexOf(string, searchString, 0).
Repeat, while position is not not-found,
1. Append position to matchPositions.
2. Set position to StringIndexOf(string, searchString, position + advanceBy).
Let endOfLastMatch be 0.
Let result be the empty String.
For each element p of matchPositions, do
1. Let preserved be the substring of string from endOfLastMatch to p.
2. If functionalReplace is true, then
  1. Let replacement be ? ToString(? Call(replaceValue, undefined, « searchString, 𝔽(p), string »)).
3. Else,
  1. Assert: replaceValue is a String.
  2. Let captures be a new empty List.
  3. Let replacement be ! GetSubstitution(searchString, string, p, captures, undefined, replaceValue).
4. Set result to the string-concatenation of result, preserved, and replacement.
5. Set endOfLastMatch to p + searchLength.
If endOfLastMatch < the length of string, then
1. Set result to the string-concatenation of result and the substring of string from endOfLastMatch.
Return result.
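
Two details of the steps above are easy to miss: matches never overlap, and an empty search string still advances by one code unit because advanceBy is max(1, searchLength). A short sketch with arbitrary inputs:

```javascript
// Matches are located left to right and never overlap.
const nonOverlapping = "aaa".replaceAll("aa", "b"); // "ba"

// An empty search string matches before every code unit and at the end.
const emptySearch = "abc".replaceAll("", "-"); // "-a-b-c-"

// A callable replaceValue receives the match, its position, and the whole string.
const withPositions = "a.b.c".replaceAll(".", (match, pos) => `[${pos}]`); // "a[1]b[3]c"

// A RegExp search value must carry the "g" flag.
let threw = false;
try {
  "abc".replaceAll(/b/, "x");
} catch (err) {
  threw = err instanceof TypeError;
}

console.log(nonOverlapping, emptySearch, withPositions, threw);
// → ba -a-b-c- a[1]b[3]c true
```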

22.1.3.21 String.prototype.search ( `regexp` )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If regexp is an Object, then
1. Let searcher be ? GetMethod(regexp, %Symbol.search%).
2. If searcher is not undefined, then
  1. Return ? Call(searcher, regexp, « O »).
Let string be ? ToString(O).
Let rx be ? RegExpCreate(regexp, undefined).
Return ? Invoke(rx, %Symbol.search%, « string »).

Note

This method is intentionally generic.

22.1.3.22 String.prototype.slice ( `start`, `end` )

This method returns a substring of the result of converting this object to a String, starting from index start and running to, but not including, index end (or through the end of the String if end is undefined). If start is negative, it is treated as sourceLength + start, where sourceLength is the length of the String. If end is negative, it is treated as sourceLength + end. The result is a String value, not a String object.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let len be the length of S.
Let intStart be ? ToIntegerOrInfinity(start).
If intStart = -∞, let from be 0.
Else if intStart < 0, let from be max(len + intStart, 0).
Else, let from be min(intStart, len).
If end is undefined, let intEnd be len; else let intEnd be ? ToIntegerOrInfinity(end).
If intEnd = -∞, let to be 0.
Else if intEnd < 0, let to be max(len + intEnd, 0).
Else, let to be min(intEnd, len).
If from ≥ to, return the empty String.
Return the substring of S from from to to.

Note

This method is intentionally generic.

22.1.3.23 String.prototype.split ( `separator`, `limit` )

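This method returns an Array into which substrings of the result of converting this object to a String have been stored. The substrings are determined by searching from left to right for occurrences of separator; these occurrences are not part of any String in the returned array, but serve to divide up the String value. The value of separator may be a String of any length or it may be an object, such as a RegExp, that has a %Symbol.split% method.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
If separator is an Object, then
1. Let splitter be ? GetMethod(separator, %Symbol.split%).
2. If splitter is not undefined, then
  1. Return ? Call(splitter, separator, « O, limit »).
Let S be ? ToString(O).
If limit is undefined, let lim be 2³² - 1; else let lim be ℝ(? ToUint32(limit)).
Let R be ? ToString(separator).
If lim = 0, then
1. Return CreateArrayFromList(« »).
If separator is undefined, then
1. Return CreateArrayFromList(« S »).
Let separatorLength be the length of R.
If separatorLength = 0, then
1. Let strLen be the length of S.
2. Let outLen be the result of clamping lim between 0 and strLen.
3. Let head be the substring of S from 0 to outLen.
4. Let codeUnits be a List consisting of the sequence of code units that are the elements of head.
5. Return CreateArrayFromList(codeUnits).
If S is the empty String, return CreateArrayFromList(« S »).
Let substrings be a new empty List.
Let i be 0.
Let j be StringIndexOf(S, R, 0).
Repeat, while j is not not-found,
1. Let T be the substring of S from i to j.
2. Append T to substrings.
3. If the number of elements in substrings is lim, return CreateArrayFromList(substrings).
4. Set i to j + separatorLength.
5. Set j to StringIndexOf(S, R, i).
Let T be the substring of S from i.
Append T to substrings.
Return CreateArrayFromList(substrings).

Note 1

The value of separator may be an empty String. In this case, separator does not match the empty substring at the beginning or end of the input String, nor does it match the empty substring at the end of the previous separator match. If separator is the empty String, the String is split up into individual code unit elements; the length of the result array equals the length of the String, and each substring contains one code unit.

If the this value is (or converts to) the empty String, the result depends on whether separator can match the empty String. If it can, the result array contains no elements. Otherwise, the result array contains one element, which is the empty String.

If separator is undefined, then the result array contains just one String, which is the this value (converted to a String). If limit is not undefined, then the output array is truncated so that it contains no more than limit elements.

Note 2

This method is intentionally generic.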
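
The edge cases described in Note 1 can be checked directly. This sketch uses small made-up inputs and covers the limit, the empty separator, the empty input, and the undefined separator.

```javascript
const limited = "a,b,c".split(",", 2);   // ["a", "b"]
// An empty separator splits into code units, so a surrogate pair is divided.
const units = "a\u{1F600}".split("");    // three elements
const emptyBoth = "".split("");          // []
const emptyInput = "".split(",");        // [""]
const noSeparator = "a,b".split();       // ["a,b"]
const zeroLimit = "a,b".split(",", 0);   // []

console.log(limited.length, units.length, emptyBoth.length, emptyInput.length, noSeparator.length, zeroLimit.length);
// → 2 3 0 1 1 0
```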

22.1.3.24 String.prototype.startsWith ( `searchString` [ , `position` ] )

This method performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let isRegExp be ? IsRegExp(searchString).
If isRegExp is true, throw a TypeError exception.
Let searchStr be ? ToString(searchString).
Let len be the length of S.
If position is undefined, let pos be 0; else let pos be ? ToIntegerOrInfinity(position).
Let start be the result of clamping pos between 0 and len.
Let searchLength be the length of searchStr.
If searchLength = 0, return true.
Let end be start + searchLength.
If end > len, return false.
Let substring be the substring of S from start to end.
If substring is searchStr, return true.
Return false.

Note 1

This method returns true if the sequence of code units of searchString converted to a String is the same as the corresponding code units of this object (converted to a String) starting at index position.

Note 2

Throwing an exception if the first argument is a RegExp is specified in order to allow for future extensions.

Note 3

This method is intentionally generic.

22.1.3.25 String.prototype.substring ( `start`, `end` )

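This method returns a substring of the result of converting this object to a String, starting from index start and running to, but not including, index end of the String (or through the end of the String if end is undefined). The result is a String value, not a String object.

If either argument is NaN or negative, it is replaced with zero; if either argument is strictly greater than the length of the String, it is replaced with the length of the String.

If start is strictly greater than end, they are swapped.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let len be the length of S.
Let intStart be ? ToIntegerOrInfinity(start).
If end is undefined, let intEnd be len; else let intEnd be ? ToIntegerOrInfinity(end).
Let finalStart be the result of clamping intStart between 0 and len.
Let finalEnd be the result of clamping intEnd between 0 and len.
Let from be min(finalStart, finalEnd).
Let to be max(finalStart, finalEnd).
Return the substring of S from from to to.

Note

This method is intentionally generic.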
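
String.prototype.slice and String.prototype.substring differ in how they handle negative and reversed arguments. The following sketch contrasts them on one arbitrary sample string.

```javascript
const s = "abcdef";

// Negative arguments: slice counts from the end, substring clamps to 0.
const sliceNegative = s.slice(-3, -1);         // "de"
const substringNegative = s.substring(-3, -1); // ""

// Reversed arguments: slice returns the empty String, substring swaps them.
const sliceReversed = s.slice(4, 1);           // ""
const substringReversed = s.substring(4, 1);   // "bcd"

// NaN is treated as 0 by ToIntegerOrInfinity.
const substringNaN = s.substring(2, NaN);      // "ab"

console.log(sliceNegative, JSON.stringify(substringNegative), JSON.stringify(sliceReversed), substringReversed, substringNaN);
// → de "" "" bcd ab
```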

22.1.3.26 String.prototype.toLocaleLowerCase ( [ `reserved1` [ , `reserved2` ] ] )

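An ECMAScript implementation that includes the ECMA-402 Internationalization API must implement this method as specified in the ECMA-402 specification. If an ECMAScript implementation does not include the ECMA-402 API, the following specification of this method is used:

This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It works exactly the same as toLowerCase except that it is intended to yield a locale-sensitive result (for example, under the special casing rules of Turkish).

The meaning of the optional parameters to this method are defined in the ECMA-402 specification; implementations that do not include ECMA-402 support must not use those parameter positions for anything else.

Note

This method is intentionally generic.

22.1.3.27 String.prototype.toLocaleUpperCase ( [ `reserved1` [ , `reserved2` ] ] )

An ECMAScript implementation that includes the ECMA-402 Internationalization API must implement this method as specified in the ECMA-402 specification. If an ECMAScript implementation does not include the ECMA-402 API, the following specification of this method is used:

This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It works exactly the same as toUpperCase except that it is intended to yield a locale-sensitive result.

The meaning of the optional parameters to this method are defined in the ECMA-402 specification; implementations that do not include ECMA-402 support must not use those parameter positions for anything else.

Note

This method is intentionally generic.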

22.1.3.28 String.prototype.toLowerCase ( )

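This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let sText be StringToCodePoints(S).
Let lowerText be toLowercase(sText), according to the Unicode Default Case Conversion algorithm.
Let L be CodePointsToString(lowerText).
Return L.

The result must be derived according to the locale-insensitive case mappings in the Unicode Character Database (this includes not only the file UnicodeData.txt, but also all locale-insensitive mappings in the file SpecialCasing.txt that accompanies it).

Note 1

The case mapping of some code points may produce multiple code points. In this case the result String may not be the same length as the source String. Because both toUpperCase and toLowerCase have context-sensitive behaviour, the methods are not symmetrical. In other words, s.toUpperCase().toLowerCase() is not necessarily equal to s.toLowerCase().

Note 2

This method is intentionally generic.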
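
Note 1 can be illustrated with two well-known mappings from the Unicode case data: LATIN SMALL LETTER SHARP S uppercases to two letters, and LATIN CAPITAL LETTER I WITH DOT ABOVE lowercases to two code points. This is a sketch only; the exact results depend on the Unicode data shipped with the engine.

```javascript
const upper = "ß".toUpperCase();       // "SS": one code point becomes two
const roundTrip = upper.toLowerCase(); // "ss", which is not the original "ß"
const dotted = "\u0130".toLowerCase(); // "i" followed by U+0307 COMBINING DOT ABOVE

console.log(upper, roundTrip, dotted.length);
// → SS ss 2
```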

22.1.3.29 String.prototype.toString ( )

This method performs the following steps when called:

Return ? ThisStringValue(this value).

Note

For a String object, this method happens to return the same thing as the valueOf method.

22.1.3.30 String.prototype.toUpperCase ( )

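This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It behaves in exactly the same way as String.prototype.toLowerCase, except that the String is mapped using the toUppercase algorithm of the Unicode Default Case Conversion.

Note

This method is intentionally generic.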

22.1.3.31 String.prototype.toWellFormed ( )

This method returns a String representation of this object with all leading surrogates and trailing surrogates that are not part of a surrogate pair replaced with U+FFFD (REPLACEMENT CHARACTER).

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let S be ? ToString(O).
Let strLen be the length of S.
Let k be 0.
Let result be the empty String.
Repeat, while k < strLen,
1. Let cp be CodePointAt(S, k).
2. If cp.[[IsUnpairedSurrogate]] is true, then
  1. Set result to the string-concatenation of result and 0xFFFD (REPLACEMENT CHARACTER).
3. Else,
  1. Set result to the string-concatenation of result and UTF16EncodeCodePoint(cp.[[CodePoint]]).
4. Set k to k + cp.[[CodeUnitCount]].
Return result.
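
The loop above can be approximated with codePointAt, which returns a surrogate value only when no valid pair starts at the given index. The function toWellFormedSketch is a hypothetical helper for illustration; it assumes a String argument and does not depend on the built-in toWellFormed being available in the runtime.

```javascript
// Hypothetical helper following the toWellFormed steps for a String argument.
function toWellFormedSketch(S) {
  let result = "";
  let k = 0;
  while (k < S.length) {
    const cp = S.codePointAt(k);
    // A value in the surrogate range here means the code unit at k is unpaired.
    const isUnpairedSurrogate = cp >= 0xd800 && cp <= 0xdfff;
    result += isUnpairedSurrogate ? "\uFFFD" : String.fromCodePoint(cp);
    // Supplementary code points occupy two code units.
    k += cp > 0xffff ? 2 : 1;
  }
  return result;
}

console.log(toWellFormedSketch("ab\uD800c") === "ab\uFFFDc");
// → true
```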

22.1.3.32 String.prototype.trim ( )

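This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It performs the following steps when called:

Let S be the this value.
Return ? TrimString(S, start+end).

Note

This method is intentionally generic.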

22.1.3.32.1 TrimString ( `string`, `where` )

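The abstract operation TrimString takes arguments string (an ECMAScript language value) and where (start, end, or start+end) and returns either a normal completion containing a String or a throw completion. It interprets string as a sequence of UTF-16 encoded code points, as described in 6.1.4. It performs the following steps when called:

Perform ? RequireObjectCoercible(string).
Let S be ? ToString(string).
If where is start, then
1. Let T be a copy of S with leading white space removed.
Else if where is end, then
1. Let T be a copy of S with trailing white space removed.
Else,
1. Assert: where is start+end.
2. Let T be a copy of S with both leading and trailing white space removed.
Return T.

The definition of white space is the union of WhiteSpace and LineTerminator. When determining whether a Unicode code point is in Unicode general category “Space_Separator” (“Zs”), code unit sequences are interpreted as UTF-16 encoded code point sequences as specified in 6.1.4.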

22.1.3.33 String.prototype.trimEnd ( )

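This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It performs the following steps when called:

Let S be the this value.
Return ? TrimString(S, end).

Note

This method is intentionally generic.

22.1.3.34 String.prototype.trimStart ( )

This method interprets a String value as a sequence of UTF-16 encoded code points, as described in 6.1.4.

It performs the following steps when called:

Let S be the this value.
Return ? TrimString(S, start).

Note

This method is intentionally generic.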

22.1.3.35 String.prototype.valueOf ( )

This method performs the following steps when called:

Return ? ThisStringValue(this value).

22.1.3.35.1 ThisStringValue ( `value` )

The abstract operation ThisStringValue takes argument value (an ECMAScript language value) and returns either a normal completion containing a String or a throw completion. It performs the following steps when called:

If value is a String, return value.
If value is an Object and value has a [[StringData]] internal slot, then
1. Let s be value.[[StringData]].
2. Assert: s is a String.
3. Return s.
Throw a TypeError exception.

22.1.3.36 String.prototype [ %Symbol.iterator% ] ( )

This method returns an iterator object that iterates over the code points of a String value, returning each code point as a String value.

It performs the following steps when called:

Let O be the this value.
Perform ? RequireObjectCoercible(O).
Let s be ? ToString(O).
Let closure be a new Abstract Closure with no parameters that captures s and performs the following steps when called:
1. Let len be the length of s.
2. Let position be 0.
3. Repeat, while position < len,
  1. Let cp be CodePointAt(s, position).
  2. Let nextIndex be position + cp.[[CodeUnitCount]].
  3. Let resultString be the substring of s from position to nextIndex.
  4. Set position to nextIndex.
  5. Perform ? GeneratorYield(CreateIteratorResultObject(resultString, false)).
4. Return NormalCompletion(unused).
Return CreateIteratorFromClosure(closure, "%StringIteratorPrototype%", %StringIteratorPrototype%).

The value of the "name" property of this method is "[Symbol.iterator]".
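
Because the iterator advances by cp.[[CodeUnitCount]], iteration yields whole code points rather than code units. A brief sketch with an arbitrary sample string:

```javascript
const s = "a\u{1F600}b"; // four code units, three code points

const byCodePoint = [...s];                     // ["a", "\u{1F600}", "b"]
const it = s[Symbol.iterator]();
const first = it.next();                        // { value: "a", done: false }
const tag = Object.prototype.toString.call(it); // "[object String Iterator]"
const methodName = String.prototype[Symbol.iterator].name;

console.log(s.length, byCodePoint.length, first.value, first.done, tag, methodName);
// → 4 3 a false [object String Iterator] [Symbol.iterator]
```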

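22.1.4 Properties of String Instances

String instances are String exotic objects and have the internal methods specified for such objects. String instances inherit properties from the String prototype object. String instances also have a [[StringData]] internal slot. The [[StringData]] internal slot is the String value represented by this String object.

String instances have a "length" property, and a set of enumerable properties with integer-indexed names.

22.1.4.1 length

The number of elements in the String value represented by this String object.

Once a String object is initialized, this property is unchanging. It has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.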

22.1.5 String Iterator Objects

A String Iterator is an object that represents a specific iteration over some specific String instance object. There is not a named constructor for String Iterator objects. Instead, String Iterator objects are created by calling certain methods of String instance objects.

22.1.5.1 The %StringIteratorPrototype% Object

The %StringIteratorPrototype% object:

has properties that are inherited by all String Iterator objects.
is an ordinary object.
has a [[Prototype]] internal slot whose value is %Iterator.prototype%.
has the following properties:

22.1.5.1.1 %StringIteratorPrototype%.next ( )

Return ? GeneratorResume(this value, empty, "%StringIteratorPrototype%").

22.1.5.1.2 %StringIteratorPrototype% [ %Symbol.toStringTag% ]

The initial value of the %Symbol.toStringTag% property is the String value "String Iterator".

This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: true }.

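22.2 RegExp (Regular Expression) Objects

A RegExp object contains a regular expression and the associated flags.

Note

The form and functionality of regular expressions is modelled after the regular expression facility in the Perl 5 programming language.

22.2.1 Patterns

The RegExp constructor applies the following grammar to the input pattern String. An error occurs if the grammar cannot interpret the String as an expansion of Pattern.

Syntax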

Pattern[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]

Disjunction[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  Alternative[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]
  Alternative[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] | Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]

Alternative[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  [empty]
  Alternative[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] Term[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]

Term[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  Assertion[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]
  Atom[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups]
  Atom[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] Quantifier

Assertion[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  ^
  $
  \b
  \B
  (?= Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )
  (?! Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )
  (?<= Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )
  (?<! Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )

Quantifier ::
  QuantifierPrefix
  QuantifierPrefix ?

QuantifierPrefix ::
  *
  +
  ?
  { DecimalDigits[~Sep] }
  { DecimalDigits[~Sep] ,}
  { DecimalDigits[~Sep] , DecimalDigits[~Sep] }

Atom[UnicodeMode, UnicodeSetsMode, NamedCaptureGroups] ::
  PatternCharacter
  .
  \ AtomEscape[?UnicodeMode, ?NamedCaptureGroups]
  CharacterClass[?UnicodeMode, ?UnicodeSetsMode]
  ( GroupSpecifier[?UnicodeMode]opt Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )
  (? RegularExpressionModifiers : Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )
  (? RegularExpressionModifiers - RegularExpressionModifiers : Disjunction[?UnicodeMode, ?UnicodeSetsMode, ?NamedCaptureGroups] )

RegularExpressionModifiers ::
  [empty]
  RegularExpressionModifiers RegularExpressionModifier

RegularExpressionModifier :: one of
  i m s

SyntaxCharacter :: one of
  ^ $ \ . * + ? ( ) [ ] { } |

PatternCharacter ::
  SourceCharacter but not SyntaxCharacter

AtomEscape[UnicodeMode, NamedCaptureGroups] ::
  DecimalEscape
  CharacterClassEscape[?UnicodeMode]
  CharacterEscape[?UnicodeMode]
  [+NamedCaptureGroups] k GroupName[?UnicodeMode]

CharacterEscape[UnicodeMode] ::
  ControlEscape
  c AsciiLetter
  0 [lookahead ∉ DecimalDigit]
  HexEscapeSequence
  RegExpUnicodeEscapeSequence[?UnicodeMode]
  IdentityEscape[?UnicodeMode]

ControlEscape :: one of
  f n r t v

GroupSpecifier[UnicodeMode] ::
  ? GroupName[?UnicodeMode]

GroupName[UnicodeMode] ::
  < RegExpIdentifierName[?UnicodeMode] >

RegExpIdentifierName[UnicodeMode] ::
  RegExpIdentifierStart[?UnicodeMode]
  RegExpIdentifierName[?UnicodeMode] RegExpIdentifierPart[?UnicodeMode]

RegExpIdentifierStart[UnicodeMode] ::
  IdentifierStartChar
  \ RegExpUnicodeEscapeSequence[+UnicodeMode]
  [~UnicodeMode] UnicodeLeadSurrogate UnicodeTrailSurrogate

RegExpIdentifierPart[UnicodeMode] ::
  IdentifierPartChar
  \ RegExpUnicodeEscapeSequence[+UnicodeMode]
  [~UnicodeMode] UnicodeLeadSurrogate UnicodeTrailSurrogate

RegExpUnicodeEscapeSequence[UnicodeMode] ::
  [+UnicodeMode] u HexLeadSurrogate \u HexTrailSurrogate
  [+UnicodeMode] u HexLeadSurrogate
  [+UnicodeMode] u HexTrailSurrogate
  [+UnicodeMode] u HexNonSurrogate
  [~UnicodeMode] u Hex4Digits
  [+UnicodeMode] u{ CodePoint }

UnicodeLeadSurrogate ::
  any Unicode code point in the inclusive interval from U+D800 to U+DBFF

UnicodeTrailSurrogate ::
  any Unicode code point in the inclusive interval from U+DC00 to U+DFFF

Each \u HexTrailSurrogate for which the choice of associated u HexLeadSurrogate is ambiguous shall be associated with the nearest possible u HexLeadSurrogate that would otherwise have no corresponding \u HexTrailSurrogate.

HexLeadSurrogate ::
  Hex4Digits but only if the MV of Hex4Digits is in the inclusive interval from 0xD800 to 0xDBFF

HexTrailSurrogate ::
  Hex4Digits but only if the MV of Hex4Digits is in the inclusive interval from 0xDC00 to 0xDFFF

HexNonSurrogate ::
  Hex4Digits but only if the MV of Hex4Digits is not in the inclusive interval from 0xD800 to 0xDFFF

IdentityEscape[UnicodeMode] ::
  [+UnicodeMode] SyntaxCharacter
  [+UnicodeMode] /
  [~UnicodeMode] SourceCharacter but not UnicodeIDContinue

DecimalEscape ::
  NonZeroDigit DecimalDigits[~Sep]opt [lookahead ∉ DecimalDigit]

CharacterClassEscape[UnicodeMode] ::
  d
  D
  s
  S
  w
  W
  [+UnicodeMode] p{ UnicodePropertyValueExpression }
  [+UnicodeMode] P{ UnicodePropertyValueExpression }

UnicodePropertyValueExpression ::
  UnicodePropertyName = UnicodePropertyValue
  LoneUnicodePropertyNameOrValue

UnicodePropertyName ::
  UnicodePropertyNameCharacters

UnicodePropertyNameCharacters ::
  UnicodePropertyNameCharacter UnicodePropertyNameCharactersopt

UnicodePropertyValue ::
  UnicodePropertyValueCharacters

LoneUnicodePropertyNameOrValue ::
  UnicodePropertyValueCharacters

UnicodePropertyValueCharacters ::
  UnicodePropertyValueCharacter UnicodePropertyValueCharactersopt

UnicodePropertyValueCharacter ::
  UnicodePropertyNameCharacter
  DecimalDigit

UnicodePropertyNameCharacter ::
  AsciiLetter
  _

CharacterClass[UnicodeMode, UnicodeSetsMode] ::
  [ [lookahead ≠ ^] ClassContents[?UnicodeMode, ?UnicodeSetsMode] ]
  [^ ClassContents[?UnicodeMode, ?UnicodeSetsMode] ]

ClassContents[UnicodeMode, UnicodeSetsMode] ::
  [empty]
  [~UnicodeSetsMode] NonemptyClassRanges[?UnicodeMode]
  [+UnicodeSetsMode] ClassSetExpression

NonemptyClassRanges[UnicodeMode] ::
  ClassAtom[?UnicodeMode]
  ClassAtom[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
  ClassAtom[?UnicodeMode] - ClassAtom[?UnicodeMode] ClassContents[?UnicodeMode, ~UnicodeSetsMode]

NonemptyClassRangesNoDash[UnicodeMode] ::
  ClassAtom[?UnicodeMode]
  ClassAtomNoDash[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
  ClassAtomNoDash[?UnicodeMode] - ClassAtom[?UnicodeMode] ClassContents[?UnicodeMode, ~UnicodeSetsMode]

ClassAtom[UnicodeMode] ::
  -
  ClassAtomNoDash[?UnicodeMode]

ClassAtomNoDash[UnicodeMode] ::
  SourceCharacter but not one of \ or ] or -
  \ ClassEscape[?UnicodeMode]

ClassEscape[UnicodeMode] ::
  b
  [+UnicodeMode] -
  CharacterClassEscape[?UnicodeMode]
  CharacterEscape[?UnicodeMode]

ClassSetExpression ::
  ClassUnion
  ClassIntersection
  ClassSubtraction

ClassUnion ::
  ClassSetRange ClassUnionopt
  ClassSetOperand ClassUnionopt

ClassIntersection ::
  ClassSetOperand && [lookahead ≠ &] ClassSetOperand
  ClassIntersection && [lookahead ≠ &] ClassSetOperand

ClassSubtraction ::
  ClassSetOperand -- ClassSetOperand
  ClassSubtraction -- ClassSetOperand

ClassSetRange ::
  ClassSetCharacter - ClassSetCharacter

ClassSetOperand ::
  NestedClass
  ClassStringDisjunction
  ClassSetCharacter

NestedClass ::
  [ [lookahead ≠ ^] ClassContents[+UnicodeMode, +UnicodeSetsMode] ]
  [^ ClassContents[+UnicodeMode, +UnicodeSetsMode] ]
  \ CharacterClassEscape[+UnicodeMode]

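Note 1

The first two lines here are equivalent to CharacterClass.

ClassStringDisjunction ::
  \q{ ClassStringDisjunctionContents }

ClassStringDisjunctionContents ::
  ClassString
  ClassString | ClassStringDisjunctionContents

ClassString ::
  [empty]
  NonEmptyClassString

NonEmptyClassString ::
  ClassSetCharacter NonEmptyClassStringopt

ClassSetCharacter ::
  [lookahead ∉ ClassSetReservedDoublePunctuator] SourceCharacter but not ClassSetSyntaxCharacter
  \ CharacterEscape[+UnicodeMode]
  \ ClassSetReservedPunctuator
  \b

ClassSetReservedDoublePunctuator :: one of
  && !! ## $$ %% ** ++ ,, .. :: ;; << == >> ?? @@ ^^ `` ~~

ClassSetSyntaxCharacter :: one of
  ( ) [ ] { } / - \ |

ClassSetReservedPunctuator :: one of
  & - ! # % , : ; < = > @ ` ~

Note 2

A number of productions in this section are given alternative definitions in section B.1.2.

22.2.1.1 Static Semantics: Early Errors

Note

This section is amended in B.1.2.1.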

Pattern :: Disjunction

It is a Syntax Error if CountLeftCapturingParensWithin(Pattern) ≥ 2³² - 1.
It is a Syntax Error if Pattern contains two distinct GroupSpecifiers x and y such that the CapturingGroupName of x is the CapturingGroupName of y and such that MightBothParticipate(x, y) is true.

QuantifierPrefix :: { DecimalDigits , DecimalDigits }

It is a Syntax Error if the MV of the first DecimalDigits is strictly greater than the MV of the second DecimalDigits.

Atom :: (? RegularExpressionModifiers : Disjunction )

It is a Syntax Error if the source text matched by RegularExpressionModifiers contains the same code point more than once.

Atom :: (? RegularExpressionModifiers - RegularExpressionModifiers : Disjunction )

It is a Syntax Error if the source text matched by the first RegularExpressionModifiers and the source text matched by the second RegularExpressionModifiers are both empty.
It is a Syntax Error if the source text matched by the first RegularExpressionModifiers contains the same code point more than once.
It is a Syntax Error if the source text matched by the second RegularExpressionModifiers contains the same code point more than once.
It is a Syntax Error if any code point in the source text matched by the first RegularExpressionModifiers is also contained in the source text matched by the second RegularExpressionModifiers.

AtomEscape :: k GroupName

It is a Syntax Error if GroupSpecifiersThatMatch(GroupName) is empty.

AtomEscape :: DecimalEscape

It is a Syntax Error if the CapturingGroupNumber of DecimalEscape is strictly greater than CountLeftCapturingParensWithin(the Pattern containing AtomEscape).

NonemptyClassRanges :: ClassAtom - ClassAtom ClassContents

It is a Syntax Error if IsCharacterClass of the first ClassAtom is true or IsCharacterClass of the second ClassAtom is true.
It is a Syntax Error if IsCharacterClass of the first ClassAtom is false, IsCharacterClass of the second ClassAtom is false, and the CharacterValue of the first ClassAtom is strictly greater than the CharacterValue of the second ClassAtom.

NonemptyClassRangesNoDash :: ClassAtomNoDash - ClassAtom ClassContents

It is a Syntax Error if IsCharacterClass of ClassAtomNoDash is true or IsCharacterClass of ClassAtom is true.
It is a Syntax Error if IsCharacterClass of ClassAtomNoDash is false, IsCharacterClass of ClassAtom is false, and the CharacterValue of ClassAtomNoDash is strictly greater than the CharacterValue of ClassAtom.

RegExpIdentifierStart :: \ RegExpUnicodeEscapeSequence

It is a Syntax Error if the CharacterValue of RegExpUnicodeEscapeSequence is not the numeric value of some code point matched by the IdentifierStartChar lexical grammar production.

RegExpIdentifierStart :: UnicodeLeadSurrogate UnicodeTrailSurrogate

It is a Syntax Error if the RegExpIdentifierCodePoint of RegExpIdentifierStart is not matched by the UnicodeIDStart lexical grammar production.

RegExpIdentifierPart :: \ RegExpUnicodeEscapeSequence

It is a Syntax Error if the CharacterValue of RegExpUnicodeEscapeSequence is not the numeric value of some code point matched by the IdentifierPartChar lexical grammar production.

RegExpIdentifierPart :: UnicodeLeadSurrogate UnicodeTrailSurrogate

It is a Syntax Error if the RegExpIdentifierCodePoint of RegExpIdentifierPart is not matched by the UnicodeIDContinue lexical grammar production.

UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue

It is a Syntax Error if the source text matched by UnicodePropertyName is not a Unicode property name or property alias listed in the “Property name and aliases” column of Table 67.
It is a Syntax Error if the source text matched by UnicodePropertyValue is not a property value or property value alias, listed in PropertyValueAliases.txt, for the Unicode property or property alias given by the source text matched by UnicodePropertyName.

UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue

It is a Syntax Error if the source text matched by LoneUnicodePropertyNameOrValue is not a Unicode property value or property value alias for the General_Category (gc) property listed in PropertyValueAliases.txt, nor a binary property or binary property alias listed in the “Property name and aliases” column of Table 68, nor a binary property of strings listed in the “Property name” column of Table 69.
It is a Syntax Error if the enclosing Pattern does not have a [UnicodeSetsMode] parameter and the source text matched by LoneUnicodePropertyNameOrValue is a binary property of strings listed in the “Property name” column of Table 69.

CharacterClassEscape :: P{ UnicodePropertyValueExpression }

It is a Syntax Error if MayContainStrings of the UnicodePropertyValueExpression is true.

CharacterClass :: [^ ClassContents ]

It is a Syntax Error if MayContainStrings of the ClassContents is true.

NestedClass :: [^ ClassContents ]

It is a Syntax Error if MayContainStrings of the ClassContents is true.

ClassSetRange :: ClassSetCharacter - ClassSetCharacter

It is a Syntax Error if the CharacterValue of the first ClassSetCharacter is strictly greater than the CharacterValue of the second ClassSetCharacter.
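
A few of these early errors can be observed through the RegExp constructor. The sketch below is illustrative only; `isEarlyError` is a hypothetical helper, not part of the specification. The u flag is used where the rule depends on Unicode mode, since Annex B relaxes some of these rules for non-Unicode patterns.

```javascript
// Hypothetical helper: true when compiling `source` with `flags` throws a SyntaxError.
function isEarlyError(source, flags = "") {
  try {
    new RegExp(source, flags);
    return false;
  } catch (error) {
    return error instanceof SyntaxError;
  }
}

const results = {
  // QuantifierPrefix :: { DecimalDigits , DecimalDigits } with first MV > second MV.
  quantifierOutOfOrder: isEarlyError("a{2,1}"),
  // Class range whose first CharacterValue exceeds the second.
  classRangeOutOfOrder: isEarlyError("[b-a]"),
  // Class range with an endpoint whose IsCharacterClass is true (Unicode mode).
  classEscapeAsRangeEndpoint: isEarlyError("[\\d-z]", "u"),
  // \k<name> with no matching GroupSpecifier.
  danglingNamedReference: isEarlyError("\\k<name>", "u"),
  // DecimalEscape larger than the number of left-capturing parentheses.
  danglingNumberedReference: isEarlyError("\\2(a)", "u"),
  // A well-formed pattern for comparison.
  wellFormed: isEarlyError("(?<name>a)\\k<name>", "u"),
};
```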

22.2.1.2 Static Semantics: CountLeftCapturingParensWithin ( `node` )

The abstract operation CountLeftCapturingParensWithin takes argument node (a Parse Node) and returns a non-negative integer. It returns the number of left-capturing parentheses in node. A left-capturing parenthesis is any ( pattern character that is matched by the ( terminal of the Atom :: ( GroupSpecifieropt Disjunction ) production.

Note

This section is amended in B.1.2.2.

It performs the following steps when called:

Assert: node is an instance of a production in the RegExp Pattern grammar.
Return the number of Atom :: ( GroupSpecifieropt Disjunction ) Parse Nodes contained within node.

22.2.1.3 Static Semantics: CountLeftCapturingParensBefore ( `node` )

The abstract operation CountLeftCapturingParensBefore takes argument node (a Parse Node) and returns a non-negative integer. It returns the number of left-capturing parentheses within the enclosing pattern that occur to the left of node.

Note

This section is amended in B.1.2.2.

It performs the following steps when called:

Assert: node is an instance of a production in the RegExp Pattern grammar.
Let pattern be the Pattern containing node.
Return the number of Atom :: ( GroupSpecifieropt Disjunction ) Parse Nodes contained within pattern that either occur before node or contain node.

22.2.1.4 Static Semantics: MightBothParticipate ( `x`, `y` )

The abstract operation MightBothParticipate takes arguments x (a Parse Node) and y (a Parse Node) and returns a Boolean. It performs the following steps when called:

Assert: x and y have the same enclosing Pattern.
If the enclosing Pattern contains a Disjunction :: Alternative | Disjunction Parse Node such that either x is contained within the Alternative and y is contained within the derived Disjunction, or x is contained within the derived Disjunction and y is contained within the Alternative, return false.
Return true.

22.2.1.5 Static Semantics: CapturingGroupNumber

The syntax-directed operation CapturingGroupNumber takes no arguments and returns a positive integer.

Note

This section is amended in B.1.2.1.

It is defined piecewise over the following productions:

DecimalEscape :: NonZeroDigit

Return the MV of NonZeroDigit.

DecimalEscape :: NonZeroDigit DecimalDigits

Let n be the number of code points in DecimalDigits.
Return (the MV of NonZeroDigit × 10ⁿ plus the MV of DecimalDigits).

The definitions of “the MV of NonZeroDigit” and “the MV of DecimalDigits” are in 12.9.3.
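
The arithmetic can be sketched as follows. This is an illustration only: `capturingGroupNumber` is a hypothetical helper that takes the two parts of the escape as strings and uses `Number` as a stand-in for the MV of a digit sequence.

```javascript
// Hypothetical helper mirroring the two DecimalEscape productions above.
function capturingGroupNumber(nonZeroDigit, decimalDigits = "") {
  // n is the number of code points in DecimalDigits (0 when it is absent).
  const n = decimalDigits.length;
  const mvOfNonZeroDigit = Number(nonZeroDigit);
  const mvOfDecimalDigits = n === 0 ? 0 : Number(decimalDigits);
  return mvOfNonZeroDigit * 10 ** n + mvOfDecimalDigits;
}

const single = capturingGroupNumber("7");       // \7
const leadingZero = capturingGroupNumber("1", "07"); // \107

// With ten groups in the pattern, \10 is a backreference to the tenth group.
const tenthGroup = /(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\10/u.test("abcdefghijj");
```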

22.2.1.6 Static Semantics: IsCharacterClass

The syntax-directed operation IsCharacterClass takes no arguments and returns a Boolean.

Note

This section is amended in B.1.2.3.

It is defined piecewise over the following productions:

ClassAtom :: -

ClassAtomNoDash :: SourceCharacter but not one of \ or ] or -

ClassEscape ::
  b
  -
  CharacterEscape

Return false.

ClassEscape :: CharacterClassEscape

Return true.

22.2.1.7 Static Semantics: CharacterValue

The syntax-directed operation CharacterValue takes no arguments and returns a non-negative integer.

Note 1

This section is amended in B.1.2.4.

It is defined piecewise over the following productions:

ClassAtom :: -

Return the numeric value of U+002D (HYPHEN-MINUS).

ClassAtomNoDash :: SourceCharacter but not one of \ or ] or -

Let ch be the code point matched by SourceCharacter.
Return the numeric value of ch.

ClassEscape :: b

Return the numeric value of U+0008 (BACKSPACE).

ClassEscape :: -

Return the numeric value of U+002D (HYPHEN-MINUS).

CharacterEscape :: ControlEscape

Return the numeric value according to Table 65.

Table 65: ControlEscape Code Point Values

ControlEscape	Numeric Value	Code Point	Unicode Name	Symbol
`t`	9	`U+0009`	CHARACTER TABULATION	<HT>
`n`	10	`U+000A`	LINE FEED (LF)	<LF>
`v`	11	`U+000B`	LINE TABULATION	<VT>
`f`	12	`U+000C`	FORM FEED (FF)	<FF>
`r`	13	`U+000D`	CARRIAGE RETURN (CR)	<CR>

CharacterEscape :: c AsciiLetter

Let ch be the code point matched by AsciiLetter.
Let i be the numeric value of ch.
Return the remainder of dividing i by 32.

CharacterEscape :: 0 [lookahead ∉ DecimalDigit]

Return the numeric value of U+0000 (NULL).

Note 2

\0 represents the <NUL> character and cannot be followed by a decimal digit.

CharacterEscape :: HexEscapeSequence

Return the MV of HexEscapeSequence.

RegExpUnicodeEscapeSequence :: u HexLeadSurrogate \u HexTrailSurrogate

Let lead be the CharacterValue of HexLeadSurrogate.
Let trail be the CharacterValue of HexTrailSurrogate.
Let cp be UTF16SurrogatePairToCodePoint(lead, trail).
Return the numeric value of cp.

RegExpUnicodeEscapeSequence :: u Hex4Digits

Return the MV of Hex4Digits.

RegExpUnicodeEscapeSequence :: u{ CodePoint }

Return the MV of CodePoint.

HexLeadSurrogate :: Hex4Digits

HexTrailSurrogate :: Hex4Digits

HexNonSurrogate :: Hex4Digits

Return the MV of Hex4Digits.

CharacterEscape :: IdentityEscape

Let ch be the code point matched by IdentityEscape.
Return the numeric value of ch.

ClassSetCharacter :: SourceCharacter but not ClassSetSyntaxCharacter

Let ch be the code point matched by SourceCharacter.
Return the numeric value of ch.

ClassSetCharacter :: \ ClassSetReservedPunctuator

Let ch be the code point matched by ClassSetReservedPunctuator.
Return the numeric value of ch.

ClassSetCharacter :: \b

Return the numeric value of U+0008 (BACKSPACE).
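
Two of the cases above involve a small computation: the control-letter escape and the surrogate-pair escape. The sketch below is illustrative only. `controlLetterValue` and `surrogatePairToCodePoint` are hypothetical names, and the second one writes out the standard UTF-16 decoding formula on the assumption that it matches what UTF16SurrogatePairToCodePoint computes.

```javascript
// CharacterEscape :: c AsciiLetter: the remainder of dividing the letter's code point by 32.
function controlLetterValue(letter) {
  return letter.codePointAt(0) % 32;
}

// Standard UTF-16 decoding of a lead/trail surrogate pair into one code point.
function surrogatePairToCodePoint(lead, trail) {
  return (lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000;
}

const upper = controlLetterValue("J"); // U+004A
const lower = controlLetterValue("j"); // U+006A, same remainder
const gClef = surrogatePairToCodePoint(0xd834, 0xdd1e);

// The same values as seen through actual patterns.
const controlMatchesLineFeed = /\cJ/.test("\n");
const pairMatchesOneCodePoint = /^\uD834\uDD1E$/u.test("\u{1D11E}");
const classBackspace = /[\b]/.test("\u0008");
```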

22.2.1.8 Static Semantics: MayContainStrings

The syntax-directed operation MayContainStrings takes no arguments and returns a Boolean. It is defined piecewise over the following productions:

CharacterClassEscape ::
  d
  D
  s
  S
  w
  W
  P{ UnicodePropertyValueExpression }

UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue

NestedClass :: [^ ClassContents ]

ClassContents ::
  [empty]
  NonemptyClassRanges

ClassSetOperand :: ClassSetCharacter

Return false.

UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue

If the source text matched by LoneUnicodePropertyNameOrValue is a binary property of strings listed in the “Property name” column of Table 69, return true.
Return false.

ClassUnion :: ClassSetRange ClassUnionopt

If the ClassUnion is present, return MayContainStrings of the ClassUnion.
Return false.

ClassUnion :: ClassSetOperand ClassUnionopt

If MayContainStrings of the ClassSetOperand is true, return true.
If the ClassUnion is present, return MayContainStrings of the ClassUnion.
Return false.

ClassIntersection :: ClassSetOperand && ClassSetOperand

If MayContainStrings of the first ClassSetOperand is false, return false.
If MayContainStrings of the second ClassSetOperand is false, return false.
Return true.

ClassIntersection :: ClassIntersection && ClassSetOperand

If MayContainStrings of the ClassIntersection is false, return false.
If MayContainStrings of the ClassSetOperand is false, return false.
Return true.

ClassSubtraction :: ClassSetOperand -- ClassSetOperand

Return MayContainStrings of the first ClassSetOperand.

ClassSubtraction :: ClassSubtraction -- ClassSetOperand

Return MayContainStrings of the ClassSubtraction.

ClassStringDisjunctionContents :: ClassString | ClassStringDisjunctionContents

If MayContainStrings of the ClassString is true, return true.
Return MayContainStrings of the ClassStringDisjunctionContents.

ClassString :: [empty]

Return true.

ClassString :: NonEmptyClassString

Return MayContainStrings of the NonEmptyClassString.

NonEmptyClassString :: ClassSetCharacter NonEmptyClassStringopt

If NonEmptyClassString is present, return true.
Return false.

22.2.1.9 Static Semantics: GroupSpecifiersThatMatch ( `thisGroupName` )

The abstract operation GroupSpecifiersThatMatch takes argument thisGroupName (a GroupName Parse Node) and returns a List of GroupSpecifier Parse Nodes. It performs the following steps when called:

Let name be the CapturingGroupName of thisGroupName.
Let pattern be the Pattern containing thisGroupName.
Let result be a new empty List.
For each GroupSpecifier gs that pattern contains, do
1. If the CapturingGroupName of gs is name, then
  1. Append gs to result.
Return result.

22.2.1.10 Static Semantics: CapturingGroupName

The syntax-directed operation CapturingGroupName takes no arguments and returns a String. It is defined piecewise over the following productions:

GroupName :: < RegExpIdentifierName >

Let idTextUnescaped be the RegExpIdentifierCodePoints of RegExpIdentifierName.
Return CodePointsToString(idTextUnescaped).

22.2.1.11 Static Semantics: RegExpIdentifierCodePoints

The syntax-directed operation RegExpIdentifierCodePoints takes no arguments and returns a List of code points. It is defined piecewise over the following productions:

RegExpIdentifierName :: RegExpIdentifierStart

Let cp be the RegExpIdentifierCodePoint of RegExpIdentifierStart.
Return « cp ».

RegExpIdentifierName :: RegExpIdentifierName RegExpIdentifierPart

Let cps be the RegExpIdentifierCodePoints of the derived RegExpIdentifierName.
Let cp be the RegExpIdentifierCodePoint of RegExpIdentifierPart.
Return the list-concatenation of cps and « cp ».

22.2.1.12 Static Semantics: RegExpIdentifierCodePoint

The syntax-directed operation RegExpIdentifierCodePoint takes no arguments and returns a code point. It is defined piecewise over the following productions:

RegExpIdentifierStart :: IdentifierStartChar

Return the code point matched by IdentifierStartChar.

RegExpIdentifierPart :: IdentifierPartChar

Return the code point matched by IdentifierPartChar.

RegExpIdentifierStart :: \ RegExpUnicodeEscapeSequence

RegExpIdentifierPart :: \ RegExpUnicodeEscapeSequence

Return the code point whose numeric value is the CharacterValue of RegExpUnicodeEscapeSequence.

RegExpIdentifierStart :: UnicodeLeadSurrogate UnicodeTrailSurrogate

RegExpIdentifierPart :: UnicodeLeadSurrogate UnicodeTrailSurrogate

Let lead be the code unit whose numeric value is the numeric value of the code point matched by UnicodeLeadSurrogate.
Let trail be the code unit whose numeric value is the numeric value of the code point matched by UnicodeTrailSurrogate.
Return UTF16SurrogatePairToCodePoint(lead, trail).

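22.2.2 Pattern Semantics

A regular expression pattern is converted into an Abstract Closure using the process described below. An implementation is encouraged to use more efficient algorithms than the ones listed below, as long as the results are the same. The Abstract Closure is used as the value of a RegExp object's [[RegExpMatcher]] internal slot.

A Pattern is a BMP pattern if its associated flags contain neither a u nor a v. Otherwise, it is a Unicode pattern. A BMP pattern matches against a String interpreted as consisting of a sequence of 16-bit values that are Unicode code points in the range of the Basic Multilingual Plane. A Unicode pattern matches against a String interpreted as consisting of Unicode code points encoded using UTF-16. In the context of describing the behaviour of a BMP pattern, “character” means a single 16-bit Unicode BMP code point. In the context of describing the behaviour of a Unicode pattern, “character” means a UTF-16 encoded code point (6.1.4). In either context, “character value” means the numeric value of the corresponding non-encoded code point.

The syntax and semantics of Pattern is defined as if the source text for the Pattern was a List of SourceCharacter values where each SourceCharacter corresponds to a Unicode code point. If a BMP pattern contains a non-BMP SourceCharacter, the entire pattern is encoded using UTF-16 and the individual code units of that encoding are used as the elements of the List.

Note

For example, consider a pattern expressed in source text as the single non-BMP character U+1D11E (MUSICAL SYMBOL G CLEF). Interpreted as a Unicode pattern, it would be a single element (character) List consisting of the single code point U+1D11E. However, interpreted as a BMP pattern, it is first UTF-16 encoded to produce a two element List consisting of the code units 0xD834 and 0xDD1E.

Patterns are passed to the RegExp constructor as ECMAScript String values in which non-BMP characters are UTF-16 encoded. For example, the single character MUSICAL SYMBOL G CLEF pattern, expressed as a String value, is a String of length 2 whose elements are the code units 0xD834 and 0xDD1E. So no further translation of the String is necessary to process it as a BMP pattern consisting of two pattern characters. However, to process it as a Unicode pattern, UTF16SurrogatePairToCodePoint must be used in producing a List whose sole element is a single pattern character, the code point U+1D11E.

An implementation may not actually perform such translations to or from UTF-16, but the semantics of this specification requires that the result of pattern matching be as if such translations were performed.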
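
The difference between the two interpretations can be observed with the G CLEF example. This is a minimal sketch; the property names of `observed` are illustrative only.

```javascript
// U+1D11E (MUSICAL SYMBOL G CLEF) as an ECMAScript String value.
const gClef = "\u{1D11E}";

const observed = {
  // The String has two elements: the code units 0xD834 and 0xDD1E.
  length: gClef.length,
  codeUnits: [gClef.charCodeAt(0), gClef.charCodeAt(1)],
  // BMP pattern: "." matches one 16-bit character, so one "." cannot cover the whole String.
  bmpSingleDot: /^.$/.test(gClef),
  bmpTwoDots: /^..$/.test(gClef),
  // Unicode pattern: "." matches one code point.
  unicodeSingleDot: /^.$/u.test(gClef),
};
```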

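22.2.2.1 Notation

The descriptions below use the following internal data structures:

A CharSetElement is one of the two following entities:
- If rer.[[UnicodeSets]] is false, then a CharSetElement is a character in the sense of the Pattern Semantics above.
- If rer.[[UnicodeSets]] is true, then a CharSetElement is a sequence whose elements are characters in the sense of the Pattern Semantics above. This includes the empty sequence, sequences of one character, and sequences of more than one character. For convenience, when working with CharSetElements of this kind, an individual character is treated interchangeably with a sequence of one character.
A CharSet is a mathematical set of CharSetElements.
A CaptureRange is a Record { [[StartIndex]], [[EndIndex]] } that represents the range of characters included in a capture, where [[StartIndex]] is an integer representing the start index (inclusive) of the range within Input, and [[EndIndex]] is an integer representing the end index (exclusive) of the range within Input. For any CaptureRange, these indices must satisfy the invariant that [[StartIndex]] ≤ [[EndIndex]].
A MatchState is a Record { [[Input]], [[EndIndex]], [[Captures]] } where [[Input]] is a List of characters representing the String being matched, [[EndIndex]] is an integer, and [[Captures]] is a List of values, one for each left-capturing parenthesis in the pattern. MatchStates are used to represent partial match states in the regular expression matching algorithms. The [[EndIndex]] is one plus the index of the last input character matched so far by the pattern, while [[Captures]] holds the results of capturing parentheses. The nth element of [[Captures]] is either a CaptureRange representing the range of characters captured by the nth set of capturing parentheses, or undefined if the nth set of capturing parentheses has not been reached yet. Due to backtracking, many MatchStates may be in use at any time during the matching process.
A MatcherContinuation is an Abstract Closure that takes one MatchState argument and returns either a MatchState or failure. The MatcherContinuation attempts to match the remaining portion (specified by the closure's captured values) of the pattern against Input, starting at the intermediate state given by its MatchState argument. If the match succeeds, the MatcherContinuation returns the final MatchState that it reached; if the match fails, the MatcherContinuation returns failure.
A Matcher is an Abstract Closure that takes two arguments, a MatchState and a MatcherContinuation, and returns either a MatchState or failure. A Matcher attempts to match a middle subpattern (specified by the closure's captured values) of the pattern against the MatchState's [[Input]], starting at the intermediate state given by its MatchState argument. The MatcherContinuation argument should be a closure that matches the rest of the pattern. After matching the subpattern of a pattern to obtain a new MatchState, the Matcher then calls MatcherContinuation on that new MatchState to test if the rest of the pattern can match as well. If it can, the Matcher returns the MatchState returned by MatcherContinuation; if not, the Matcher may try different choices at its choice points, repeatedly calling MatcherContinuation until it either succeeds or all possibilities have been exhausted.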

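22.2.2.1.1 RegExp Records

A RegExp Record is a Record value used to store information about a RegExp that is needed during compilation and possibly during matching.

It has the following fields:

Table 66: RegExp Record Fields

Field Name	Value	Meaning
`[[IgnoreCase]]`	a Boolean	indicates whether "i" appears in the RegExp's flags
`[[Multiline]]`	a Boolean	indicates whether "m" appears in the RegExp's flags
`[[DotAll]]`	a Boolean	indicates whether "s" appears in the RegExp's flags
`[[Unicode]]`	a Boolean	indicates whether "u" appears in the RegExp's flags
`[[UnicodeSets]]`	a Boolean	indicates whether "v" appears in the RegExp's flags
`[[CapturingGroupsCount]]`	a non-negative integer	the number of left-capturing parentheses in the RegExp's pattern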
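
A RegExp Record is internal and cannot be read directly, but an equivalent object can be derived from a RegExp instance. The sketch below is an illustration under stated assumptions: `toRegExpRecord` is a hypothetical helper, and the group count is obtained with a trick (appending an empty alternative so that a probe copy always matches the empty String) rather than by any mechanism the specification describes.

```javascript
// Hypothetical helper: builds a plain object shaped like a RegExp Record.
function toRegExpRecord(regexp) {
  const flags = regexp.flags;
  // "X|" always matches "", and exec returns one array slot per capturing group
  // plus one for the whole match. The g and y flags are dropped to keep the probe stateless.
  const probe = new RegExp(`${regexp.source}|`, flags.replace(/[gy]/g, ""));
  return {
    IgnoreCase: flags.includes("i"),
    Multiline: flags.includes("m"),
    DotAll: flags.includes("s"),
    Unicode: flags.includes("u"),
    UnicodeSets: flags.includes("v"),
    CapturingGroupsCount: probe.exec("").length - 1,
  };
}

// Two capturing groups: (a) and (?<n>c). The group (?:b) does not capture.
const record = toRegExpRecord(/(a)(?:b)(?<n>c)/gi);
```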

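22.2.2.2 Runtime Semantics: CompilePattern

The syntax-directed operation CompilePattern takes argument rer (a RegExp Record) and returns an Abstract Closure that takes a List of characters and a non-negative integer and returns either a MatchState or failure. It is defined piecewise over the following productions:

Pattern :: Disjunction

Let m be CompileSubpattern of Disjunction with arguments rer and forward.
Return a new Abstract Closure with parameters (Input, index) that captures rer and m and performs the following steps when called:
1. Assert: Input is a List of characters.
2. Assert: 0 ≤ index ≤ the number of elements in Input.
3. Let c be a new MatcherContinuation with parameters (y) that captures nothing and performs the following steps when called:
  1. Assert: y is a MatchState.
  2. Return y.
4. Let cap be a List of rer.[[CapturingGroupsCount]] undefined values, indexed 1 through rer.[[CapturingGroupsCount]].
5. Let x be the MatchState { [[Input]]: Input, [[EndIndex]]: index, [[Captures]]: cap }.
6. Return m(x, c).

Note

A Pattern compiles to an Abstract Closure value. RegExpBuiltinExec can then apply this procedure to a List of characters and an offset within that List to determine whether the pattern would match starting at exactly that offset within the List, and, if it does match, what the values of the capturing parentheses would be. The algorithms in 22.2.2 are designed so that compiling a pattern may throw a SyntaxError exception; on the other hand, once the pattern is successfully compiled, applying the resulting Abstract Closure to find a match in a List of characters cannot throw an exception (except for any implementation-defined exceptions that can occur anywhere, such as out-of-memory).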

22.2.2.3 Runtime Semantics: CompileSubpattern

The syntax-directed operation CompileSubpattern takes arguments rer (a RegExp Record) and direction (forward or backward) and returns a Matcher.

Note 1

This section is amended in B.1.2.5.

It is defined piecewise over the following productions:

Disjunction :: Alternative | Disjunction

Let m1 be CompileSubpattern of Alternative with arguments rer and direction.
Let m2 be CompileSubpattern of Disjunction with arguments rer and direction.
Return MatchTwoAlternatives(m1, m2).

Note 2

The | regular expression operator separates two alternatives. The pattern first tries to match the left Alternative (followed by the sequel of the regular expression); if it fails, it tries to match the right Disjunction (followed by the sequel of the regular expression). If the left Alternative, the right Disjunction, and the sequel all have choice points, all choices in the sequel are tried before moving on to the next choice in the left Alternative. If choices in the left Alternative are exhausted, the right Disjunction is tried instead of the left Alternative. Any capturing parentheses inside a portion of the pattern skipped by | produce undefined values instead of Strings. Thus, for example,

/a|ab/.exec("abc")

returns the result "a" and not "ab". Moreover,

/((a)|(ab))((c)|(bc))/.exec("abc")

returns the array

["abc", "a", "a", undefined, "bc", undefined, "bc"]

and not

["abc", "ab", undefined, "ab", "c", "c", undefined]

The order in which the two alternatives are tried is independent of the value of direction.

Alternative :: [empty]

Return EmptyMatcher().

Alternative :: Alternative Term

Let m1 be CompileSubpattern of Alternative with arguments rer and direction.
Let m2 be CompileSubpattern of Term with arguments rer and direction.
Return MatchSequence(m1, m2, direction).

Note 3

Consecutive Terms try to simultaneously match consecutive portions of Input. When direction is forward, if the left Alternative, the right Term, and the sequel of the regular expression all have choice points, all choices in the sequel are tried before moving on to the next choice in the right Term, and all choices in the right Term are tried before moving on to the next choice in the left Alternative. When direction is backward, the evaluation order of Alternative and Term is swapped.

Term :: Assertion

Return CompileAssertion of Assertion with argument rer.

Note 4

The resulting Matcher is independent of direction.

Term :: Atom

Return CompileAtom of Atom with arguments rer and direction.

Term :: Atom Quantifier

Let m be CompileAtom of Atom with arguments rer and direction.
Let q be CompileQuantifier of Quantifier.
Assert: q.[[Min]] ≤ q.[[Max]].
Let parenIndex be CountLeftCapturingParensBefore(Term).
Let parenCount be CountLeftCapturingParensWithin(Atom).
Return a new Matcher with parameters (x, c) that captures m, q, parenIndex, and parenCount and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Return RepeatMatcher(m, q.[[Min]], q.[[Max]], q.[[Greedy]], x, c, parenIndex, parenCount).

22.2.2.3.1 RepeatMatcher ( `m`, `min`, `max`, `greedy`, `x`, `c`, `parenIndex`, `parenCount` )

The abstract operation RepeatMatcher takes arguments m (a Matcher), min (a non-negative integer), max (a non-negative integer or +∞), greedy (a Boolean), x (a MatchState), c (a MatcherContinuation), parenIndex (a non-negative integer), and parenCount (a non-negative integer) and returns either a MatchState or failure. It performs the following steps when called:

If max = 0, return c(x).
Let d be a new MatcherContinuation with parameters (y) that captures m, min, max, greedy, x, c, parenIndex, and parenCount and performs the following steps when called:
1. Assert: y is a MatchState.
2. If min = 0 and y.[[EndIndex]] = x.[[EndIndex]], return failure.
3. If min = 0, let min2 be 0; otherwise let min2 be min - 1.
4. If max = +∞, let max2 be +∞; otherwise let max2 be max - 1.
5. Return RepeatMatcher(m, min2, max2, greedy, y, c, parenIndex, parenCount).
Let cap be a copy of x.[[Captures]].
For each integer k in the inclusive interval from parenIndex + 1 to parenIndex + parenCount, set cap[k] to undefined.
Let Input be x.[[Input]].
Let e be x.[[EndIndex]].
Let xr be the MatchState { [[Input]]: Input, [[EndIndex]]: e, [[Captures]]: cap }.
If min ≠ 0, return m(xr, d).
If greedy is false, then
1. Let z be c(x).
2. If z is not failure, return z.
3. Return m(xr, d).
Let z be m(xr, d).
If z is not failure, return z.
Return c(x).
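
The steps above translate fairly directly into code. The following is a minimal sketch rather than a real engine: MatchStates are plain objects with `input`, `endIndex`, and `captures` properties, `FAILURE` is a sentinel, and `charMatcher` is a hypothetical helper that stands in for a compiled Atom.

```javascript
const FAILURE = Symbol("failure");

// Hypothetical helper: a Matcher that consumes one character satisfying `predicate`.
function charMatcher(predicate) {
  return (x, c) => {
    const ch = x.input[x.endIndex];
    if (ch === undefined || !predicate(ch)) return FAILURE;
    return c({ ...x, endIndex: x.endIndex + 1 });
  };
}

function repeatMatcher(m, min, max, greedy, x, c, parenIndex, parenCount) {
  if (max === 0) return c(x);
  // Continuation d: runs after one more repetition of m has matched.
  const d = (y) => {
    // Once the minimum is met, a repetition that consumed nothing is rejected.
    if (min === 0 && y.endIndex === x.endIndex) return FAILURE;
    const min2 = min === 0 ? 0 : min - 1;
    const max2 = max === Infinity ? Infinity : max - 1;
    return repeatMatcher(m, min2, max2, greedy, y, c, parenIndex, parenCount);
  };
  // Clear the captures that belong to the repeated Atom (captures are 1-indexed).
  const captures = x.captures.slice();
  for (let k = parenIndex + 1; k <= parenIndex + parenCount; k++) {
    captures[k] = undefined;
  }
  const xr = { input: x.input, endIndex: x.endIndex, captures };
  if (min !== 0) return m(xr, d);
  if (!greedy) {
    // Non-greedy: try the sequel first, repeat only if it fails.
    const z = c(x);
    if (z !== FAILURE) return z;
    return m(xr, d);
  }
  // Greedy: try another repetition first, fall back to the sequel.
  const z = m(xr, d);
  if (z !== FAILURE) return z;
  return c(x);
}

// Usage: emulate [a-z]{2,4} and [a-z]{2,4}? starting at index 1 of "abcdefghi".
const lowercase = charMatcher((ch) => ch >= "a" && ch <= "z");
const start = { input: [..."abcdefghi"], endIndex: 1, captures: [undefined] };
const accept = (y) => y;
const greedyEnd = repeatMatcher(lowercase, 2, 4, true, start, accept, 0, 0).endIndex;
const lazyEnd = repeatMatcher(lowercase, 2, 4, false, start, accept, 0, 0).endIndex;
```

With an accepting continuation, the greedy call stops after four repetitions and the non-greedy call after two.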

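Note 1

An Atom followed by a Quantifier is repeated the number of times specified by the Quantifier. A Quantifier can be non-greedy, in which case the Atom pattern is repeated as few times as possible while still matching the sequel, or it can be greedy, in which case the Atom pattern is repeated as many times as possible while still matching the sequel. The Atom pattern is repeated rather than the input character sequence that it matches, so different repetitions of the Atom can match different input substrings.

Note 2

If the Atom and the sequel of the regular expression all have choice points, the Atom is first matched as many (or as few, if non-greedy) times as possible. All choices in the sequel are tried before moving on to the next choice in the last repetition of Atom. All choices in the last (nth) repetition of Atom are tried before moving on to the next choice in the next-to-last (n - 1)st repetition of Atom; at which point it may turn out that more or fewer repetitions of Atom are now possible; these are exhausted (again, starting with either as few or as many as possible) before moving on to the next choice in the (n - 1)st repetition of Atom and so on.

Compare

/a[a-z]{2,4}/.exec("abcdefghi")

which returns "abcde" with

/a[a-z]{2,4}?/.exec("abcdefghi")

which returns "abc".

Consider also

/(aa|aabaac|ba|b|c)*/.exec("aabaac")

which, by the choice point ordering above, returns the array

["aaba", "ba"]

and not any of:

["aabaac", "aabaac"]
["aabaac", "c"]

The above ordering of choice points can be used to write a regular expression that calculates the greatest common divisor of two numbers (represented in unary notation). The following example calculates the gcd of 10 and 15:

"aaaaaaaaaa,aaaaaaaaaaaaaaa".replace(/^(a+)\1*,\1+$/, "$1")

which returns the gcd in unary notation "aaaaa".

Note 3

Step 4 of the RepeatMatcher (the step that sets cap[k] to undefined) clears Atom's captures each time Atom is repeated. We can see its behaviour in the regular expression

/(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")

which returns the array

["zaacbbbcac", "z", "ac", "a", undefined, "c"]

and not

["zaacbbbcac", "z", "ac", "a", "bbb", "c"]

because each iteration of the outermost * clears all captured Strings contained in the quantified Atom, which in this case includes capture Strings numbered 2, 3, 4, and 5.

Note 4

Step 2.b of the RepeatMatcher (the second step of the MatcherContinuation d) states that once the minimum number of repetitions has been satisfied, any more expansions of Atom that match the empty character sequence are not considered for further repetitions. This prevents the regular expression engine from falling into an infinite loop on patterns such as:

/(a*)*/.exec("b")

or the slightly more complicated:

/(a*)b\1+/.exec("baaaac")

which returns the array

["b", ""]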
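
The results quoted in these notes can be checked in any conforming implementation. The snippet below simply collects them; the variable names are illustrative only.

```javascript
// Note 2: greedy versus non-greedy repetition.
const greedy = /a[a-z]{2,4}/.exec("abcdefghi")[0];
const lazy = /a[a-z]{2,4}?/.exec("abcdefghi")[0];

// Note 2: choice point ordering.
const choiceOrder = Array.from(/(aa|aabaac|ba|b|c)*/.exec("aabaac"));

// Note 2: gcd of 10 and 15 in unary notation.
const gcd = "aaaaaaaaaa,aaaaaaaaaaaaaaa".replace(/^(a+)\1*,\1+$/, "$1");

// Note 3: captures inside the quantified Atom are cleared on each repetition.
const cleared = Array.from(/(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac"));

// Note 4: empty repetitions do not cause an infinite loop.
const emptyRepetition = /(a*)*/.exec("b")[0];
const emptyBackreference = Array.from(/(a*)b\1+/.exec("baaaac"));
```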

22.2.2.3.2 EmptyMatcher ( )

The abstract operation EmptyMatcher takes no arguments and returns a Matcher. It performs the following steps when called:

Return a new Matcher with parameters (x, c) that captures nothing and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Return c(x).

22.2.2.3.3 MatchTwoAlternatives ( `m1`, `m2` )

The abstract operation MatchTwoAlternatives takes arguments m1 (a Matcher) and m2 (a Matcher) and returns a Matcher. It performs the following steps when called:

Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let r be m1(x, c).
4. If r is not failure, return r.
5. Return m2(x, c).

22.2.2.3.4 MatchSequence ( `m1`, `m2`, `direction` )

The abstract operation MatchSequence takes arguments m1 (a Matcher), m2 (a Matcher), and direction (forward or backward) and returns a Matcher. It performs the following steps when called:

If direction is forward, then
1. Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
  1. Assert: x is a MatchState.
  2. Assert: c is a MatcherContinuation.
  3. Let d be a new MatcherContinuation with parameters (y) that captures c and m2 and performs the following steps when called:
    1. Assert: y is a MatchState.
    2. Return m2(y, c).
  4. Return m1(x, d).
Else,
1. Assert: direction is backward.
2. Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
  1. Assert: x is a MatchState.
  2. Assert: c is a MatcherContinuation.
  3. Let d be a new MatcherContinuation with parameters (y) that captures c and m1 and performs the following steps when called:
    1. Assert: y is a MatchState.
    2. Return m1(y, c).
  4. Return m2(x, d).
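
These three operations are small enough to sketch as higher-order functions. The example below is illustrative only: MatchStates are plain objects, `FAILURE` is a sentinel, and `literal` is a hypothetical forward-only helper standing in for a compiled Atom. It emulates the pattern (?:a|ab)c to show how the continuation lets the first alternative be abandoned when the sequel fails.

```javascript
const FAILURE = Symbol("failure");

// Hypothetical helper: a forward Matcher for a single literal character.
const literal = (expected) => (x, c) =>
  x.input[x.endIndex] === expected
    ? c({ ...x, endIndex: x.endIndex + 1 })
    : FAILURE;

const emptyMatcher = () => (x, c) => c(x);

const matchTwoAlternatives = (m1, m2) => (x, c) => {
  const r = m1(x, c);
  return r !== FAILURE ? r : m2(x, c);
};

const matchSequence = (m1, m2, direction) =>
  direction === "forward"
    ? (x, c) => m1(x, (y) => m2(y, c))
    : (x, c) => m2(x, (y) => m1(y, c));

// (?:a|ab)c built from the pieces above.
const a = literal("a");
const ab = matchSequence(a, literal("b"), "forward");
const pattern = matchSequence(matchTwoAlternatives(a, ab), literal("c"), "forward");

// Run the Matcher from index 0 with a continuation that accepts any state.
const run = (text) =>
  pattern({ input: [...text], endIndex: 0, captures: [] }, (y) => y);

const viaSecondAlternative = run("abc"); // "a" then "c" fails, so "ab" then "c" is tried
const viaFirstAlternative = run("ac");
const noMatch = run("abd");
```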

22.2.2.4 런타임 의미론: CompileAssertion : Matcher

The syntax-directed operation UNKNOWN takes UNPARSEABLE ARGUMENTS.

Note 1

이 절은 B.1.2.6 에서 수정된다.

It is defined piecewise over the following productions:

Assertion

rer 를 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. Input 을 x.[[Input]] 으로 둔다.
4. e 를 x.[[EndIndex]] 로 둔다.
5. e = 0 이거나 (rer.[[Multiline]] 이 true 이고 문자 Input[e - 1] 이 LineTerminator 에 매치되면)
  1. c(x) 를 반환한다.
6. failure 를 반환한다.

Note 2

y 플래그를 사용해도 ^ 는 항상 Input 의 시작(또는 rer.[[Multiline]] 이 true 이면 줄의 시작)에서만 매치된다.

Assertion

rer 를 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. Input 을 x.[[Input]] 으로 둔다.
4. e 를 x.[[EndIndex]] 로 둔다.
5. InputLength 를 Input 의 요소 개수로 둔다.
6. e = InputLength 이거나 (rer.[[Multiline]] 이 true 이고 문자 Input[e] 가 LineTerminator 에 매치되면)
  1. c(x) 를 반환한다.
7. failure 를 반환한다.

Assertion

rer 를 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. Input 을 x.[[Input]] 으로 둔다.
4. e 를 x.[[EndIndex]] 로 둔다.
5. a 를 IsWordChar(rer, Input, e - 1) 로 둔다.
6. b 를 IsWordChar(rer, Input, e) 로 둔다.
7. (a 가 true 이고 b 가 false) 또는 (a 가 false 이고 b 가 true) 이면 c(x) 반환.
8. failure 반환.

Assertion

rer 를 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. Input 을 x.[[Input]] 으로 둔다.
4. e 를 x.[[EndIndex]] 로 둔다.
5. a 를 IsWordChar(rer, Input, e - 1) 로 둔다.
6. b 를 IsWordChar(rer, Input, e) 로 둔다.
7. (a 와 b 가 모두 true) 또는 (a 와 b 가 모두 false) 이면 c(x) 반환.
8. failure 반환.

Assertion

(?=

Disjunction

)

m 을 Disjunction 에 대해 인수 rer, forward 로 CompileSubpattern 한 결과로 둔다.
m 을 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. d 를 매개변수 (y) 를 가지며 아무것도 캡처하지 않고 호출 시 다음을 수행하는 새 MatcherContinuation 으로 둔다:
  1. Assert: y 는 MatchState.
  2. y 반환.
4. r 를 m(x, d) 로 둔다.
5. r 가 failure 이면 failure 반환.
6. Assert: r 는 MatchState.
7. cap 을 r.[[Captures]] 로 둔다.
8. Input 을 x.[[Input]] 으로 둔다.
9. xe 를 x.[[EndIndex]] 로 둔다.
10. z 를 MatchState { [[Input]]: Input, [[EndIndex]]: xe, [[Captures]]: cap } 로 둔다.
11. c(z) 반환.

Note 3

(?= Disjunction ) 형식은 폭 0 양수 전방 탐색(positive lookahead)을 지정한다. 성공하려면 Disjunction 내부 패턴이 현재 위치에서 매치되어야 하지만, 나머지를 매칭하기 전에 현재 위치는 전진하지 않는다. Disjunction 이 현재 위치에서 여러 방식으로 매치 가능하면 첫 번째 방식만 시도된다. 다른 정규 표현식 연산자와 달리 (?= 형식 안으로의 백트래킹은 없다(이 비정상적 동작은 Perl 에서 유래). 이는 Disjunction 이 캡처 괄호를 포함하고 패턴의 나머지 부분이 그 캡처에 대한 역참조를 포함할 때에만 중요하다.

예:

/(?=(a+))/.exec("baaabac")

는 첫 번째 b 직후 빈 문자열을 매치하므로 배열:

["", "aaa"]

를 반환한다.

전방 탐색 내부로의 백트래킹 부재를 보여주기 위해:

/(?=(a+))a*b\1/.exec("baaabac")

이 표현식은

["aba", "a"]

를 반환하며,

["aaaba", "a"]

는 아니다.

Assertion

(?!

Disjunction

)

m 을 Disjunction 에 대해 인수 rer, forward 로 CompileSubpattern 한 결과로 둔다.
m 을 캡처하고 매개변수 (x, c) 를 가지며 호출 시 다음을 수행하는 새 Matcher 를 반환한다:
1. Assert: x 는 MatchState.
2. Assert: c 는 MatcherContinuation.
3. d 를 매개변수 (y) 를 가지며 아무것도 캡처하지 않고 호출 시 다음을 수행하는 새 MatcherContinuation 으로 둔다:
  1. Assert: y 는 MatchState.
  2. y 반환.
4. r 를 m(x, d) 로 둔다.
5. r 가 failure 가 아니면 failure 반환.
6. c(x) 반환.

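Note 4

The form (?! Disjunction ) specifies a zero-width negative lookahead. In order for it to succeed, the pattern inside Disjunction must fail to match at the current position. The current position is not advanced before matching the sequel. Disjunction can contain capturing parentheses, but backreferences to them only make sense from within Disjunction itself. Backreferences to these capturing parentheses from elsewhere in the pattern always return undefined because the negative lookahead must fail for the pattern to succeed. For example,

/(.*?)a(?!(a+)b\2c)\2(.*)/.exec("baaabaac")

looks for an a not immediately followed by some positive number n of a's, a b, another n a's (specified by the first \2) and a c. The second \2 is outside the negative lookahead, so it matches against undefined and therefore always succeeds. The whole expression returns the array:

["baaabaac", "ba", undefined, "abaac"]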

Assertion :: (?<= Disjunction )

Let m be CompileSubpattern of Disjunction with arguments rer and backward.
Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let d be a new MatcherContinuation with parameters (y) that captures nothing and performs the following steps when called:
  1. Assert: y is a MatchState.
  2. Return y.
4. Let r be m(x, d).
5. If r is failure, return failure.
6. Assert: r is a MatchState.
7. Let cap be r.[[Captures]].
8. Let Input be x.[[Input]].
9. Let xe be x.[[EndIndex]].
10. Let z be the MatchState { [[Input]]: Input, [[EndIndex]]: xe, [[Captures]]: cap }.
11. Return c(z).

Assertion :: (?<! Disjunction )

Let m be CompileSubpattern of Disjunction with arguments rer and backward.
Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let d be a new MatcherContinuation with parameters (y) that captures nothing and performs the following steps when called:
  1. Assert: y is a MatchState.
  2. Return y.
4. Let r be m(x, d).
5. If r is not failure, return failure.
6. Return c(x).
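
The two lookbehind forms have no accompanying example in this clause, so the following is a minimal illustrative sketch rather than normative text. The variable names are hypothetical. The last line shows an observable consequence of compiling the Disjunction with direction backward: inside a lookbehind the rightmost group is matched first.

```javascript
// Positive lookbehind: digits that are immediately preceded by "$".
const price = /(?<=\$)\d+/.exec("cost: $42");

// Negative lookbehind: a "b" that is not immediately preceded by "a".
const notAfterA = /(?<!a)b/.exec("ab cb");

// Inside a lookbehind the subpattern is matched right to left, so the
// rightmost greedy group (group 2) consumes digits before group 1 does.
const backward = /(?<=(\d+)(\d+))$/.exec("1053");

console.log(price[0]);                 // "42"
console.log(notAfterA.index);          // 4
console.log(backward[1], backward[2]); // "1" "053"
```

With a forward lookahead such as /(?=(\d+)(\d+))/ the greedy first group would take the longer share instead.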

22.2.2.4.1 IsWordChar ( `rer`, `Input`, `e` )

The abstract operation IsWordChar takes arguments rer (a RegExp Record), Input (a List of characters), and e (an integer) and returns a Boolean. It performs the following steps when called:

Let InputLength be the number of elements in Input.
If e = -1 or e = InputLength, return false.
Let c be the character Input[e].
If WordCharacters(rer) contains c, return true.
Return false.
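
As an informal sketch of how IsWordChar drives the \b assertion, the helpers below re-implement it for the simplest configuration only: no Unicode flag and no ignoreCase, where the word characters are assumed to be exactly [A-Za-z0-9_]. The function names are hypothetical and are not part of the specification.

```javascript
// Hypothetical helper: IsWordChar when neither Unicode flag nor ignoreCase
// is set, so the word characters are exactly [A-Za-z0-9_].
function isWordChar(input, e) {
  if (e === -1 || e === input.length) return false;
  return /^[A-Za-z0-9_]$/.test(input[e]);
}

// \b holds at e when exactly one of the two neighbours is a word character.
function isWordBoundary(input, e) {
  return isWordChar(input, e - 1) !== isWordChar(input, e);
}

function boundaryPositions(input) {
  const positions = [];
  for (let e = 0; e <= input.length; e++) {
    if (isWordBoundary(input, e)) positions.push(e);
  }
  return positions;
}

const sketch = boundaryPositions("ab cd");
const builtin = [..."ab cd".matchAll(/\b/g)].map((m) => m.index);
console.log(sketch, builtin); // both are [ 0, 2, 3, 5 ]
```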

22.2.2.5 Runtime Semantics: CompileQuantifier

The syntax-directed operation CompileQuantifier takes no arguments and returns a Record with fields [[Min]] (a non-negative integer), [[Max]] (a non-negative integer or +∞), and [[Greedy]] (a Boolean). It is defined piecewise over the following productions:

Quantifier :: QuantifierPrefix

Let qp be CompileQuantifierPrefix of QuantifierPrefix.
Return the Record { [[Min]]: qp.[[Min]], [[Max]]: qp.[[Max]], [[Greedy]]: true }.

Quantifier :: QuantifierPrefix ?

Let qp be CompileQuantifierPrefix of QuantifierPrefix.
Return the Record { [[Min]]: qp.[[Min]], [[Max]]: qp.[[Max]], [[Greedy]]: false }.

22.2.2.6 Runtime Semantics: CompileQuantifierPrefix

The syntax-directed operation CompileQuantifierPrefix takes no arguments and returns a Record with fields [[Min]] (a non-negative integer) and [[Max]] (a non-negative integer or +∞). It is defined piecewise over the following productions:

QuantifierPrefix :: *

Return the Record { [[Min]]: 0, [[Max]]: +∞ }.

QuantifierPrefix :: +

Return the Record { [[Min]]: 1, [[Max]]: +∞ }.

QuantifierPrefix :: ?

Return the Record { [[Min]]: 0, [[Max]]: 1 }.

QuantifierPrefix :: { DecimalDigits }

Let i be the MV of DecimalDigits (see 12.9.3).
Return the Record { [[Min]]: i, [[Max]]: i }.

QuantifierPrefix :: { DecimalDigits ,}

Let i be the MV of DecimalDigits.
Return the Record { [[Min]]: i, [[Max]]: +∞ }.

QuantifierPrefix :: { DecimalDigits , DecimalDigits }

Let i be the MV of the first DecimalDigits.
Let j be the MV of the second DecimalDigits.
Return the Record { [[Min]]: i, [[Max]]: j }.
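
The two operations above can be pictured with a small sketch that maps quantifier source text to a record with min, max, and greedy fields. This is only an illustration under stated assumptions: compileQuantifier is a hypothetical helper that parses the text with a regular expression, whereas the specification works on Parse Nodes, and Infinity stands in for +∞.

```javascript
// Hypothetical helper: turn quantifier source text such as "*", "+?" or
// "{2,5}" into { min, max, greedy }, mirroring the records described above.
function compileQuantifier(text) {
  const m = /^(?:([*+?])|\{(\d+)(?:(,)(\d*))?\})(\?)?$/.exec(text);
  if (m === null) throw new SyntaxError("not a quantifier: " + text);
  const greedy = m[5] === undefined; // a trailing "?" makes it non-greedy
  if (m[1] === "*") return { min: 0, max: Infinity, greedy };
  if (m[1] === "+") return { min: 1, max: Infinity, greedy };
  if (m[1] === "?") return { min: 0, max: 1, greedy };
  const min = Number(m[2]);
  let max;
  if (m[3] === undefined) max = min;        // {n}
  else if (m[4] === "") max = Infinity;     // {n,}
  else max = Number(m[4]);                  // {n,m}
  return { min, max, greedy };
}

console.log(compileQuantifier("*"));      // { min: 0, max: Infinity, greedy: true }
console.log(compileQuantifier("{2,5}?")); // { min: 2, max: 5, greedy: false }
console.log(compileQuantifier("{3,}"));   // { min: 3, max: Infinity, greedy: true }
```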

22.2.2.7 Runtime Semantics: CompileAtom

The syntax-directed operation CompileAtom takes arguments rer (a RegExp Record) and direction (forward or backward) and returns a Matcher.

Note 1

This section is amended in B.1.2.7.

It is defined piecewise over the following productions:

Atom :: PatternCharacter

Let ch be the character matched by PatternCharacter.
Let A be a one-element CharSet containing the character ch.
Return CharacterSetMatcher(rer, A, false, direction).

Atom :: .

Let A be AllCharacters(rer).
If rer.[[DotAll]] is not true, then
1. Remove from A all characters corresponding to a code point on the right-hand side of the LineTerminator production.
Return CharacterSetMatcher(rer, A, false, direction).
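
The effect of the [[DotAll]] check for the . atom can be observed directly. The following is a small illustrative check; the variable names are hypothetical.

```javascript
const withoutS = /a.b/;  // DotAll is false: line terminators are removed from the set
const withS = /a.b/s;    // DotAll is true: . matches every character

console.log(withoutS.test("a\nb"));     // false
console.log(withS.test("a\nb"));        // true
console.log(withoutS.test("a\u2028b")); // false, U+2028 is also a LineTerminator
console.log(withoutS.test("a-b"));      // true
```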

Atom :: CharacterClass

Let cc be CompileCharacterClass of CharacterClass with argument rer.
Let cs be cc.[[CharSet]].
If rer.[[UnicodeSets]] is false, or if every CharSetElement of cs consists of a single character (including if cs is empty), return CharacterSetMatcher(rer, cs, cc.[[Invert]], direction).
Assert: cc.[[Invert]] is false.
Let lm be an empty List of Matchers.
For each CharSetElement s in cs containing more than 1 character, iterating in descending order of length, do
1. Let cs2 be a one-element CharSet containing the last code point of s.
2. Let m2 be CharacterSetMatcher(rer, cs2, false, direction).
3. For each code point c1 in s, iterating backwards from its second-to-last code point, do
  1. Let cs1 be a one-element CharSet containing c1.
  2. Let m1 be CharacterSetMatcher(rer, cs1, false, direction).
  3. Set m2 to MatchSequence(m1, m2, direction).
4. Append m2 to lm.
Let singles be the CharSet containing every CharSetElement of cs that consists of a single character.
Append CharacterSetMatcher(rer, singles, false, direction) to lm.
If cs contains the empty sequence of characters, append EmptyMatcher() to lm.
Let m2 be the last Matcher in lm.
For each Matcher m1 of lm, iterating backwards from its second-to-last element, do
1. Set m2 to MatchTwoAlternatives(m1, m2).
Return m2.

Atom :: ( GroupSpecifier_opt Disjunction )

Let m be CompileSubpattern of Disjunction with arguments rer and direction.
Let parenIndex be CountLeftCapturingParensBefore(Atom).
Return a new Matcher with parameters (x, c) that captures direction, m, and parenIndex and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let d be a new MatcherContinuation with parameters (y) that captures x, c, direction, and parenIndex and performs the following steps when called:
  1. Assert: y is a MatchState.
  2. Let cap be a copy of y.[[Captures]].
  3. Let Input be x.[[Input]].
  4. Let xe be x.[[EndIndex]].
  5. Let ye be y.[[EndIndex]].
  6. If direction is forward, then
    1. Assert: xe ≤ ye.
    2. Let r be the CaptureRange { [[StartIndex]]: xe, [[EndIndex]]: ye }.
  7. Else,
    1. Assert: direction is backward.
    2. Assert: ye ≤ xe.
    3. Let r be the CaptureRange { [[StartIndex]]: ye, [[EndIndex]]: xe }.
  8. Set cap[parenIndex + 1] to r.
  9. Let z be the MatchState { [[Input]]: Input, [[EndIndex]]: ye, [[Captures]]: cap }.
  10. Return c(z).
4. Return m(x, d).

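Note 2

Parentheses of the form ( Disjunction ) serve both to group the components of the Disjunction pattern together and to save the result of the match. The result can be used either in a backreference (\ followed by a non-zero decimal number), referenced in a replace String, or returned as part of an array from the regular expression matching Abstract Closure. To inhibit the capturing behaviour of parentheses, use the form (?: Disjunction ) instead.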
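
The three uses of a saved capture named in the note can be shown in a few lines. This is an illustrative sketch with hypothetical variable names, not part of the specification.

```javascript
// Both groups capture, so the result array has three elements.
const capturing = /(\d+)-(\d+)/.exec("10-20");

// (?: ... ) groups without capturing, so only one capture is recorded.
const nonCapturing = /(?:\d+)-(\d+)/.exec("10-20");

// Captures referenced from a replace String.
const swapped = "10-20".replace(/(\d+)-(\d+)/, "$2-$1");

console.log(capturing.length, capturing[1], capturing[2]); // 3 "10" "20"
console.log(nonCapturing.length, nonCapturing[1]);         // 2 "20"
console.log(swapped);                                      // "20-10"
```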

Atom :: (? RegularExpressionModifiers : Disjunction )

Let addModifiers be the source text matched by RegularExpressionModifiers.
Let removeModifiers be the empty String.
Let modifiedRer be UpdateModifiers(rer, CodePointsToString(addModifiers), removeModifiers).
Return CompileSubpattern of Disjunction with arguments modifiedRer and direction.

Atom :: (? RegularExpressionModifiers - RegularExpressionModifiers : Disjunction )

Let addModifiers be the source text matched by the first RegularExpressionModifiers.
Let removeModifiers be the source text matched by the second RegularExpressionModifiers.
Let modifiedRer be UpdateModifiers(rer, CodePointsToString(addModifiers), CodePointsToString(removeModifiers)).
Return CompileSubpattern of Disjunction with arguments modifiedRer and direction.

AtomEscape :: DecimalEscape

Let n be the CapturingGroupNumber of DecimalEscape.
Assert: n ≤ rer.[[CapturingGroupsCount]].
Return BackreferenceMatcher(rer, « n », direction).

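Note 3

An escape sequence of the form \ followed by a non-zero decimal number n matches the result of the nth set of capturing parentheses (22.2.2.1). It is an error if the regular expression has fewer than n capturing parentheses. If the regular expression has n or more capturing parentheses but the nth one is undefined because it has not captured anything, then the backreference always succeeds.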
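
The three cases described in the note can be checked as follows. This is an illustrative sketch: compiles is a hypothetical helper, and the u flag is used for the error case because the amendments in Annex B change how \2 is interpreted in patterns without a Unicode flag.

```javascript
// Hypothetical helper: does this pattern text parse with these flags?
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// \1 must repeat whatever group 1 captured.
const repeated = /(\w)\1/.exec("hello");

// Group 1 did not participate in the match, so \1 succeeds trivially.
const unset = /(?:(a)|b)\1/.exec("b");

console.log(repeated[0], repeated[1]);  // "ll" "l"
console.log(unset[0], unset[1]);        // "b" undefined
console.log(compiles("(a)\\1", "u"));   // true
console.log(compiles("(a)\\2", "u"));   // false: only one capturing group
```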

AtomEscape :: CharacterEscape

Let cv be the CharacterValue of CharacterEscape.
Let ch be the character whose character value is cv.
Let A be a one-element CharSet containing the character ch.
Return CharacterSetMatcher(rer, A, false, direction).

AtomEscape :: CharacterClassEscape

Let cs be CompileToCharSet of CharacterClassEscape with argument rer.
If rer.[[UnicodeSets]] is false, or if every CharSetElement of cs consists of a single character (including if cs is empty), return CharacterSetMatcher(rer, cs, false, direction).
Let lm be an empty List of Matchers.
For each CharSetElement s in cs containing more than 1 character, iterating in descending order of length, do
1. Let cs2 be a one-element CharSet containing the last code point of s.
2. Let m2 be CharacterSetMatcher(rer, cs2, false, direction).
3. For each code point c1 in s, iterating backwards from its second-to-last code point, do
  1. Let cs1 be a one-element CharSet containing c1.
  2. Let m1 be CharacterSetMatcher(rer, cs1, false, direction).
  3. Set m2 to MatchSequence(m1, m2, direction).
4. Append m2 to lm.
Let singles be the CharSet containing every CharSetElement of cs that consists of a single character.
Append CharacterSetMatcher(rer, singles, false, direction) to lm.
If cs contains the empty sequence of characters, append EmptyMatcher() to lm.
Let m2 be the last Matcher in lm.
For each Matcher m1 of lm, iterating backwards from its second-to-last element, do
1. Set m2 to MatchTwoAlternatives(m1, m2).
Return m2.

AtomEscape :: k GroupName

Let matchingGroupSpecifiers be GroupSpecifiersThatMatch(GroupName).
Let parenIndices be a new empty List.
For each GroupSpecifier groupSpecifier of matchingGroupSpecifiers, do
1. Let parenIndex be CountLeftCapturingParensBefore(groupSpecifier).
2. Append parenIndex to parenIndices.
Return BackreferenceMatcher(rer, parenIndices, direction).
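
A named backreference behaves like a numbered one but is resolved through the group name. The sketch below is illustrative only and uses a single group named quote (a hypothetical name); the List of indices in the steps above exists because more than one GroupSpecifier may match a GroupName.

```javascript
// \k<quote> must repeat the same quote character that opened the string.
const quoted = /(?<quote>['"])(.*?)\k<quote>/.exec(`say "it's" now`);

console.log(quoted[0]);           // "it's" including the double quotes
console.log(quoted.groups.quote); // the double quote character
console.log(quoted[2]);           // it's
```

The apostrophe inside the text does not end the match, because the backreference requires the character captured by the named group.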

22.2.2.7.1 CharacterSetMatcher ( `rer`, `A`, `invert`, `direction` )

The abstract operation CharacterSetMatcher takes arguments rer (a RegExp Record), A (a CharSet), invert (a Boolean), and direction (forward or backward) and returns a Matcher. It performs the following steps when called:

If rer.[[UnicodeSets]] is true, then
1. Assert: invert is false.
2. Assert: Every CharSetElement of A consists of a single character.
Return a new Matcher with parameters (x, c) that captures rer, A, invert, and direction and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let Input be x.[[Input]].
4. Let e be x.[[EndIndex]].
5. If direction is forward, let f be e + 1.
6. Else, let f be e - 1.
7. Let InputLength be the number of elements in Input.
8. If f < 0 or f > InputLength, return failure.
9. Let index be min(e, f).
10. Let ch be the character Input[index].
11. Let cc be Canonicalize(rer, ch).
12. If there exists a CharSetElement in A containing exactly one character a such that Canonicalize(rer, a) is cc, let found be true. Otherwise, let found be false.
13. If invert is false and found is false, return failure.
14. If invert is true and found is true, return failure.
15. Let cap be x.[[Captures]].
16. Let y be the MatchState { [[Input]]: Input, [[EndIndex]]: f, [[Captures]]: cap }.
17. Return c(y).
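
The continuation-passing shape of a Matcher may be easier to see in code. The following is a simplified sketch, not the specification's algorithm: it assumes case-sensitive matching (so Canonicalize is the identity), represents a CharSet as a JavaScript Set of single characters, and uses hypothetical names (FAILURE, characterSetMatcher, identity) and plain objects for MatchState.

```javascript
const FAILURE = Symbol("failure");

// Simplified CharacterSetMatcher: chars is a Set of single characters.
function characterSetMatcher(chars, invert, direction) {
  return (x, c) => {
    const e = x.endIndex;
    const f = direction === "forward" ? e + 1 : e - 1;
    if (f < 0 || f > x.input.length) return FAILURE;
    // The character examined lies between positions e and f.
    const ch = x.input[Math.min(e, f)];
    const found = chars.has(ch);
    // Fails when (not inverted and not found) or (inverted and found).
    if (found === invert) return FAILURE;
    return c({ input: x.input, endIndex: f, captures: x.captures });
  };
}

const identity = (state) => state; // continuation that accepts any state
const matchA = characterSetMatcher(new Set(["a"]), false, "forward");
const matchABackward = characterSetMatcher(new Set(["a"]), false, "backward");
const start = { input: Array.from("ab"), endIndex: 0, captures: [] };

const afterA = matchA(start, identity);               // consumes "a"
const atB = matchA(afterA, identity);                 // "b" is not in the set
const backToStart = matchABackward(afterA, identity); // re-reads "a" leftwards

console.log(afterA.endIndex, atB === FAILURE, backToStart.endIndex); // 1 true 0
```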

22.2.2.7.2 BackreferenceMatcher ( `rer`, `ns`, `direction` )

The abstract operation BackreferenceMatcher takes arguments rer (a RegExp Record), ns (a List of positive integers), and direction (forward or backward) and returns a Matcher. It performs the following steps when called:

Return a new Matcher with parameters (x, c) that captures rer, ns, and direction and performs the following steps when called:
1. Assert: x is a MatchState.
2. Assert: c is a MatcherContinuation.
3. Let Input be x.[[Input]].
4. Let cap be x.[[Captures]].
5. Let r be undefined.
6. For each integer n of ns, do
  1. If cap[n] is not undefined, then
    1. Assert: r is undefined.
    2. Set r to cap[n].
7. If r is undefined, return c(x).
8. Let e be x.[[EndIndex]].
9. Let rs be r.[[StartIndex]].
10. Let re be r.[[EndIndex]].
11. Let len be re - rs.
12. If direction is forward, let f be e + len. Otherwise, let f be e - len.
13. Let InputLength be the number of elements in Input.
14. If f < 0 or f > InputLength, return failure.
15. Let g be min(e, f).
16. If there exists an integer i in the interval from 0 (inclusive) to len (exclusive) such that Canonicalize(rer, Input[rs + i]) is not Canonicalize(rer, Input[g + i]), return failure.
17. Let y be the MatchState { [[Input]]: Input, [[EndIndex]]: f, [[Captures]]: cap }.
18. Return c(y).

22.2.2.7.3 Canonicalize ( `rer`, `ch` )

The abstract operation Canonicalize takes arguments rer (a RegExp Record) and ch (a character) and returns a character. It performs the following steps when called:

If HasEitherUnicodeFlag(rer) is true and rer.[[IgnoreCase]] is true, then
1. If the file CaseFolding.txt of the Unicode Character Database provides a simple or common case folding mapping for ch, return the result of applying that mapping to ch.
2. Return ch.
If rer.[[IgnoreCase]] is false, return ch.
Assert: ch is a UTF-16 code unit.
Let cp be the code point whose numeric value is the numeric value of ch.
Let u be toUppercase(« cp »), according to the Unicode Default Case Conversion algorithm.
Let uStr be CodePointsToString(u).
If the length of uStr ≠ 1, return ch.
Let cu be uStr's single code unit element.
If the numeric value of ch ≥ 128 and the numeric value of cu < 128, return ch.
Return cu.

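Note

In case-insensitive matches when HasEitherUnicodeFlag(rer) is true, all characters are implicitly case-folded using the simple mapping provided by the Unicode Standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, ß (U+00DF LATIN SMALL LETTER SHARP S) to ss or SS. It may however map code points outside the Basic Latin block to code points within it, for example ſ (U+017F LATIN SMALL LETTER LONG S) to s (U+0073) and K (U+212A KELVIN SIGN) to k (U+006B). Strings containing those code points are matched by regular expressions such as /[a-z]/ui.

In case-insensitive matches when HasEitherUnicodeFlag(rer) is false, the mapping is based on the Unicode Default Case Conversion algorithm toUppercase rather than toCasefold, which results in some subtle differences. For example, Ω (U+2126 OHM SIGN) is mapped by toUppercase to itself but by toCasefold to ω (U+03C9) along with Ω (U+03A9), so "\u2126" is matched by /[ω]/ui and /[\u03A9]/ui but not by /[ω]/i or /[\u03A9]/i. Also, no code point outside the Basic Latin block is mapped to a code point within it, so strings such as "\u017F ſ" and "\u212A K" are not matched by /[a-z]/i.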
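
The differences described in this note are observable from script. The checks below restate the note's own examples in runnable form; the constant names are hypothetical.

```javascript
const OHM = "\u2126";    // Ω OHM SIGN
const LONG_S = "\u017F"; // ſ LATIN SMALL LETTER LONG S
const KELVIN = "\u212A"; // K KELVIN SIGN

// With a Unicode flag, simple case folding is applied before comparing.
console.log(/[ω]/ui.test(OHM));      // true
console.log(/[\u03A9]/ui.test(OHM)); // true
console.log(/[a-z]/ui.test(LONG_S)); // true
console.log(/[a-z]/ui.test(KELVIN)); // true

// Without one, the toUppercase-based mapping is used instead.
console.log(/[ω]/i.test(OHM));       // false
console.log(/[\u03A9]/i.test(OHM));  // false
console.log(/[a-z]/i.test(LONG_S));  // false
console.log(/[a-z]/i.test(KELVIN));  // false

// Simple folding never maps one code point to several.
console.log(/^ss$/ui.test("\u00DF")); // false
```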

22.2.2.7.4 UpdateModifiers ( `rer`, `add`, `remove` )

The abstract operation UpdateModifiers takes arguments rer (a RegExp Record), add (a String), and remove (a String) and returns a RegExp Record. It performs the following steps when called:

Assert: add and remove have no elements in common.
Let ignoreCase be rer.[[IgnoreCase]].
Let multiline be rer.[[Multiline]].
Let dotAll be rer.[[DotAll]].
Let unicode be rer.[[Unicode]].
Let unicodeSets be rer.[[UnicodeSets]].
Let capturingGroupsCount be rer.[[CapturingGroupsCount]].
If remove contains "i", set ignoreCase to false.
Else if add contains "i", set ignoreCase to true.
If remove contains "m", set multiline to false.
Else if add contains "m", set multiline to true.
If remove contains "s", set dotAll to false.
Else if add contains "s", set dotAll to true.
Return the RegExp Record { [[IgnoreCase]]: ignoreCase, [[Multiline]]: multiline, [[DotAll]]: dotAll, [[Unicode]]: unicode, [[UnicodeSets]]: unicodeSets, [[CapturingGroupsCount]]: capturingGroupsCount }.
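
The steps above can be sketched as a pure function over a plain object. This is an assumption-laden illustration: updateModifiers is a hypothetical helper, a RegExp Record is modelled as an ordinary object with camel-cased field names, and the leading assertion is rendered as a thrown Error.

```javascript
// Hypothetical sketch of UpdateModifiers over a plain-object "record".
function updateModifiers(rer, add, remove) {
  for (const flag of add) {
    if (remove.includes(flag)) throw new Error("add and remove overlap: " + flag);
  }
  // remove wins over add; otherwise the current value is kept.
  const pick = (flag, current) =>
    remove.includes(flag) ? false : add.includes(flag) ? true : current;
  return {
    ...rer, // unicode, unicodeSets and capturingGroupsCount are copied unchanged
    ignoreCase: pick("i", rer.ignoreCase),
    multiline: pick("m", rer.multiline),
    dotAll: pick("s", rer.dotAll),
  };
}

const outer = {
  ignoreCase: false, multiline: true, dotAll: false,
  unicode: true, unicodeSets: false, capturingGroupsCount: 2,
};
const inner = updateModifiers(outer, "i", "m"); // as for a group written (?i-m: ... )
console.log(inner.ignoreCase, inner.multiline, inner.dotAll); // true false false
```

The input record is not mutated, which matches the way the modified record applies only to the Disjunction inside the group.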

22.2.2.8 Runtime Semantics: CompileCharacterClass

The syntax-directed operation CompileCharacterClass takes argument rer (a RegExp Record) and returns a Record with fields [[CharSet]] (a CharSet) and [[Invert]] (a Boolean). It is defined piecewise over the following productions:

CharacterClass :: [ ClassContents ]

Let A be CompileToCharSet of ClassContents with argument rer.
Return the Record { [[CharSet]]: A, [[Invert]]: false }.

CharacterClass :: [^ ClassContents ]

Let A be CompileToCharSet of ClassContents with argument rer.
If rer.[[UnicodeSets]] is true, then
1. Return the Record { [[CharSet]]: CharacterComplement(rer, A), [[Invert]]: false }.
Return the Record { [[CharSet]]: A, [[Invert]]: true }.

22.2.2.9 Runtime Semantics: CompileToCharSet

The syntax-directed operation CompileToCharSet takes argument rer (a RegExp Record) and returns a CharSet.

Note 1

This section is amended in B.1.2.8.

It is defined piecewise over the following productions:

ClassContents :: [empty]

Return the empty CharSet.

NonemptyClassRanges :: ClassAtom NonemptyClassRangesNoDash

Let A be CompileToCharSet of ClassAtom with argument rer.
Let B be CompileToCharSet of NonemptyClassRangesNoDash with argument rer.
Return the union of CharSets A and B.

NonemptyClassRanges :: ClassAtom - ClassAtom ClassContents

Let A be CompileToCharSet of the first ClassAtom with argument rer.
Let B be CompileToCharSet of the second ClassAtom with argument rer.
Let C be CompileToCharSet of ClassContents with argument rer.
Let D be CharacterRange(A, B).
Return the union of D and C.

NonemptyClassRangesNoDash :: ClassAtomNoDash NonemptyClassRangesNoDash

Let A be CompileToCharSet of ClassAtomNoDash with argument rer.
Let B be CompileToCharSet of NonemptyClassRangesNoDash with argument rer.
Return the union of CharSets A and B.

NonemptyClassRangesNoDash :: ClassAtomNoDash - ClassAtom ClassContents

Let A be CompileToCharSet of ClassAtomNoDash with argument rer.
Let B be CompileToCharSet of ClassAtom with argument rer.
Let C be CompileToCharSet of ClassContents with argument rer.
Let D be CharacterRange(A, B).
Return the union of D and C.

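Note 2

ClassContents can expand into a single ClassAtom and/or ranges of two ClassAtoms separated by dashes. In the latter case the ClassContents includes all characters between the first ClassAtom and the second ClassAtom, inclusive. It is an error if either ClassAtom does not represent a single character (for example, if one is \w) or if the first ClassAtom's character value is strictly greater than the second ClassAtom's character value.

Note 3

Even if the pattern ignores case, the case of the two ends of a range is significant in determining which characters belong to the range. Thus, for example, the pattern /[E-F]/i matches only the letters E, F, e, and f, while the pattern /[E-f]/i matches all uppercase and lowercase letters in the Unicode Basic Latin block as well as the symbols [, \, ], ^, _, and `.

Note 4

A - character can be treated literally or it can denote a range. It is treated literally if it is the first or last character of ClassContents, the beginning or end limit of a range specification, or immediately follows a range specification.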
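
Notes 2 to 4 can be checked with a few expressions. This is an illustrative sketch; compiles is a hypothetical helper, and the u flag is used for the \w case because Annex B relaxes that error for patterns without a Unicode flag.

```javascript
// Hypothetical helper: does this pattern text parse with these flags?
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Note 2: a range end must be a single character and the range must be ordered.
console.log(compiles("[z-a]"));        // false
console.log(compiles("[\\w-a]", "u")); // false

// Note 3: the case of the range ends matters even with the i flag.
const narrow = /^[E-F]$/i;
const wide = /^[E-f]$/i;
console.log(narrow.test("e"), narrow.test("g")); // true false
console.log(wide.test("z"), wide.test("^"));     // true true

// Note 4: a leading or trailing - is a literal hyphen.
console.log(/^[a-]$/.test("-"), /^[-a]$/.test("-")); // true true
```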

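ClassAtom :: -

Return the CharSet containing the single character - U+002D (HYPHEN-MINUS).

ClassAtomNoDash :: SourceCharacter but not one of \ or ] or -

Return the CharSet containing the character matched by SourceCharacter.

ClassEscape :: b
ClassEscape :: -
ClassEscape :: CharacterEscape

Let cv be the CharacterValue of this ClassEscape.
Let c be the character whose character value is cv.
Return the CharSet containing the single character c.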

Note 5

A ClassAtom can use any of the escape sequences that are allowed in the rest of the regular expression except for \b, \B, and backreferences. Inside a CharacterClass, \b means the backspace character, while \B and backreferences raise errors.
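
A brief illustration of the note follows. The helper compiles is hypothetical, and the error cases are shown with the u flag because Annex B gives \B and numeric escapes inside a class a different, more lenient interpretation in patterns without a Unicode flag.

```javascript
// Hypothetical helper: does this pattern text parse with these flags?
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Inside a class, \b is U+0008 (BACKSPACE), not a word boundary.
console.log(/[\b]/.test("\u0008")); // true
console.log(/[\b]/.test("b"));      // false

// \B and backreferences are errors inside a class.
console.log(compiles("[\\B]", "u"));    // false
console.log(compiles("(a)[\\1]", "u")); // false
```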

CharacterClassEscape :: d

Return the ten-element CharSet containing the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.

CharacterClassEscape :: D

Let S be the CharSet returned by CharacterClassEscape :: d.
Return CharacterComplement(rer, S).

CharacterClassEscape :: s

Return the CharSet containing all characters corresponding to a code point on the right-hand side of the WhiteSpace or LineTerminator productions.

CharacterClassEscape :: S

Let S be the CharSet returned by CharacterClassEscape :: s.
Return CharacterComplement(rer, S).

CharacterClassEscape :: w

Return MaybeSimpleCaseFolding(rer, WordCharacters(rer)).

CharacterClassEscape :: W

Let S be the CharSet returned by CharacterClassEscape :: w.
Return CharacterComplement(rer, S).

CharacterClassEscape :: p{ UnicodePropertyValueExpression }

Return CompileToCharSet of UnicodePropertyValueExpression with argument rer.

CharacterClassEscape :: P{ UnicodePropertyValueExpression }

Let S be CompileToCharSet of UnicodePropertyValueExpression with argument rer.
Assert: Every element of S is a single code point.
Return CharacterComplement(rer, S).

UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue

Let ps be the source text matched by UnicodePropertyName.
Let p be UnicodeMatchProperty(rer, ps).
Assert: p is a Unicode property name or property alias listed in the “Property name and aliases” column of Table 67.
Let vs be the source text matched by UnicodePropertyValue.
Let v be UnicodeMatchPropertyValue(p, vs).
Let A be the CharSet containing all Unicode code points whose character database definition includes the property p with value v.
Return MaybeSimpleCaseFolding(rer, A).

UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue

Let s be the source text matched by LoneUnicodePropertyNameOrValue.
If UnicodeMatchPropertyValue(General_Category, s) is a Unicode property value or property value alias for the General_Category (gc) property listed in PropertyValueAliases.txt, then
1. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value s.
Let p be UnicodeMatchProperty(rer, s).
Assert: p is a binary Unicode property or binary property alias listed in the “Property name and aliases” column of Table 68, or a binary Unicode property of strings listed in the “Property name” column of Table 69.
Let A be the CharSet containing all CharSetElements whose character database definition includes the property p with value “True”.
Return MaybeSimpleCaseFolding(rer, A).
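
The two UnicodePropertyValueExpression forms correspond to the following uses of \p{...} and \P{...}. This is an illustrative sketch; the outcomes depend on the Unicode data shipped with the engine, although the characters used here have long-stable properties.

```javascript
// Name = Value form: a non-binary property with an explicit value.
console.log(/\p{Script=Greek}/u.test("ω"));           // true
console.log(/\p{gc=Decimal_Number}/u.test("\u0663")); // true, ARABIC-INDIC DIGIT THREE

// Lone form, first case: a General_Category value on its own.
console.log(/\p{Lu}/u.test("A")); // true
console.log(/\P{Lu}/u.test("A")); // false, \P is the complement

// Lone form, second case: a binary property or one of its aliases.
console.log(/\p{ASCII_Hex_Digit}/u.test("f")); // true
console.log(/\p{AHex}/u.test("g"));            // false
```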

ClassUnion :: ClassSetRange ClassUnion_opt

Let A be CompileToCharSet of ClassSetRange with argument rer.
If ClassUnion is present, then
1. Let B be CompileToCharSet of ClassUnion with argument rer.
2. Return the union of CharSets A and B.
Return A.

ClassUnion :: ClassSetOperand ClassUnion_opt

Let A be CompileToCharSet of ClassSetOperand with argument rer.
If ClassUnion is present, then
1. Let B be CompileToCharSet of ClassUnion with argument rer.
2. Return the union of CharSets A and B.
Return A.

ClassIntersection :: ClassSetOperand && ClassSetOperand

Let A be CompileToCharSet of the first ClassSetOperand with argument rer.
Let B be CompileToCharSet of the second ClassSetOperand with argument rer.
Return the intersection of CharSets A and B.

ClassIntersection :: ClassIntersection && ClassSetOperand

Let A be CompileToCharSet of ClassIntersection with argument rer.
Let B be CompileToCharSet of ClassSetOperand with argument rer.
Return the intersection of CharSets A and B.

ClassSubtraction :: ClassSetOperand -- ClassSetOperand

Let A be CompileToCharSet of the first ClassSetOperand with argument rer.
Let B be CompileToCharSet of the second ClassSetOperand with argument rer.
Return the CharSet containing the CharSetElements of A which are not also CharSetElements of B.

ClassSubtraction :: ClassSubtraction -- ClassSetOperand

Let A be CompileToCharSet of ClassSubtraction with argument rer.
Let B be CompileToCharSet of ClassSetOperand with argument rer.
Return the CharSet containing the CharSetElements of A which are not also CharSetElements of B.

ClassSetRange :: ClassSetCharacter - ClassSetCharacter

Let A be CompileToCharSet of the first ClassSetCharacter with argument rer.
Let B be CompileToCharSet of the second ClassSetCharacter with argument rer.
Return MaybeSimpleCaseFolding(rer, CharacterRange(A, B)).

Note 6

The result will often consist of two or more ranges. When UnicodeSets is true and IgnoreCase is true, then MaybeSimpleCaseFolding(rer, [Ā-č]) will include only the odd-numbered code points of that range.

ClassSetOperand :: ClassSetCharacter

Let A be CompileToCharSet of ClassSetCharacter with argument rer.
Return MaybeSimpleCaseFolding(rer, A).

ClassSetOperand :: ClassStringDisjunction

Let A be CompileToCharSet of ClassStringDisjunction with argument rer.
Return MaybeSimpleCaseFolding(rer, A).

ClassSetOperand :: NestedClass

Return CompileToCharSet of NestedClass with argument rer.

NestedClass :: [ ClassContents ]

Return CompileToCharSet of ClassContents with argument rer.

NestedClass :: [^ ClassContents ]

Let A be CompileToCharSet of ClassContents with argument rer.
Return CharacterComplement(rer, A).

NestedClass :: \ CharacterClassEscape

Return CompileToCharSet of CharacterClassEscape with argument rer.

ClassStringDisjunction :: \q{ ClassStringDisjunctionContents }

Return CompileToCharSet of ClassStringDisjunctionContents with argument rer.

ClassStringDisjunctionContents :: ClassString

Let s be CompileClassSetString of ClassString with argument rer.
Return the CharSet containing the one string s.

ClassStringDisjunctionContents :: ClassString | ClassStringDisjunctionContents

Let s be CompileClassSetString of ClassString with argument rer.
Let A be the CharSet containing the one string s.
Let B be CompileToCharSet of ClassStringDisjunctionContents with argument rer.
Return the union of CharSets A and B.

ClassSetCharacter :: SourceCharacter but not ClassSetSyntaxCharacter
ClassSetCharacter :: \ CharacterEscape
ClassSetCharacter :: \ ClassSetReservedPunctuator

Let cv be the CharacterValue of this ClassSetCharacter.
Let c be the character whose character value is cv.
Return the CharSet containing the single character c.

ClassSetCharacter :: \b

Return the CharSet containing the single character U+0008 (BACKSPACE).
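
The set operations and \q{...} string alternatives defined above are only available with the v flag, which older engines do not parse. The sketch below therefore builds each pattern through a hypothetical helper, tryRegExp, that returns null when the engine rejects it, and only reports results when all three patterns are supported.

```javascript
// Hypothetical helper: returns null if the engine rejects the pattern or flag.
function tryRegExp(source, flags) {
  try {
    return new RegExp(source, flags);
  } catch (e) {
    if (e instanceof SyntaxError) return null;
    throw e;
  }
}

const consonants = tryRegExp("^[[a-z]--[aeiou]]$", "v"); // ClassSubtraction
const vowels = tryRegExp("^[[a-z]&&[aeiou]]$", "v");     // ClassIntersection
const strings = tryRegExp("[\\q{a|ab}x]", "v");          // ClassStringDisjunction

if (consonants && vowels && strings) {
  console.log(consonants.test("b"), consonants.test("e")); // true false
  console.log(vowels.test("e"), vowels.test("b"));         // true false
  // Longer strings are tried first, so "ab" wins over "a".
  console.log(strings.exec("ab")[0]);                      // "ab"
} else {
  console.log("the v flag is not supported by this engine");
}
```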

22.2.2.9.1 CharacterRange ( `A`, `B` )

The abstract operation CharacterRange takes arguments A (a CharSet) and B (a CharSet) and returns a CharSet. It performs the following steps when called:

Assert: A and B each contain exactly one character.
Let a be the one character in CharSet A.
Let b be the one character in CharSet B.
Let i be the character value of character a.
Let j be the character value of character b.
Assert: i ≤ j.
Return the CharSet containing all characters with a character value in the inclusive interval from i to j.

22.2.2.9.2 HasEitherUnicodeFlag ( `rer` )

The abstract operation HasEitherUnicodeFlag takes argument rer (a RegExp Record) and returns a Boolean. It performs the following steps when called:

If rer.[[Unicode]] is true or rer.[[UnicodeSets]] is true, then
1. Return true.
Return false.

22.2.2.9.3 WordCharacters ( `rer` )

The abstract operation WordCharacters takes argument rer (a RegExp Record) and returns a CharSet. It returns a CharSet containing the characters considered to be “word characters” for the purposes of \b, \B, \w, and \W. It performs the following steps when called:

Let basicWordChars be the CharSet containing every character in the ASCII word characters.
Let extraWordChars be the CharSet of all characters c such that c is not in basicWordChars but Canonicalize(rer, c) is in basicWordChars.
Assert: extraWordChars is empty unless HasEitherUnicodeFlag(rer) is true and rer.[[IgnoreCase]] is true.
Return the union of basicWordChars and extraWordChars.
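
The extraWordChars step has a visible effect only when a Unicode flag is combined with ignoreCase. The following checks are an illustrative sketch of that effect, using the two code points that simple case folding maps into the ASCII letters; the constant names are hypothetical.

```javascript
const LONG_S = "\u017F"; // ſ folds to s
const KELVIN = "\u212A"; // K folds to k

// Both flags set: the extra characters count as word characters.
console.log(/\w/iu.test(LONG_S), /\w/iu.test(KELVIN)); // true true

// Only one of the two flags set: extraWordChars is empty.
console.log(/\w/i.test(LONG_S), /\w/u.test(LONG_S));   // false false
```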

22.2.2.9.4 AllCharacters ( `rer` )

The abstract operation AllCharacters takes argument rer (a RegExp Record) and returns a CharSet. It returns the set of “all characters” according to the regular expression flags. It performs the following steps when called:

If rer.[[UnicodeSets]] is true and rer.[[IgnoreCase]] is true, then
1. Return the CharSet containing all Unicode code points c that do not have a Simple Case Folding mapping (that is, scf(c) = c).
Else if HasEitherUnicodeFlag(rer) is true, then
1. Return the CharSet containing all code point values.
Else,
1. Return the CharSet containing all code unit values.

22.2.2.9.5 MaybeSimpleCaseFolding ( `rer`, `A` )

The abstract operation MaybeSimpleCaseFolding takes arguments rer (a RegExp Record) and A (a CharSet) and returns a CharSet. If rer.[[UnicodeSets]] is false or rer.[[IgnoreCase]] is false, it returns A. Otherwise, it uses the Simple Case Folding (scf(cp)) definitions in the file CaseFolding.txt of the Unicode Character Database (each of which maps a single code point to another single code point) to map each CharSetElement of A character-by-character into a canonical form and returns the resulting CharSet. It performs the following steps when called:

If rer.[[UnicodeSets]] is false or rer.[[IgnoreCase]] is false, return A.
Let B be a new empty CharSet.
For each CharSetElement s of A, do
1. Let t be an empty sequence of characters.
2. For each single code point cp in s, do
  1. Append scf(cp) to t.
3. Add t to B.
Return B.

22.2.2.9.6 CharacterComplement ( `rer`, `S` )

The abstract operation CharacterComplement takes arguments rer (a RegExp Record) and S (a CharSet) and returns a CharSet. It performs the following steps when called:

Let A be AllCharacters(rer).
Return the CharSet containing the CharSetElements of A which are not also CharSetElements of S.

22.2.2.9.7 UnicodeMatchProperty ( `rer`, `p` )

The abstract operation UnicodeMatchProperty takes arguments rer (a RegExp Record) and p (ECMAScript source text) and returns a Unicode property name. It performs the following steps when called:

If rer.[[UnicodeSets]] is true and p is a Unicode property name listed in the “Property name” column of Table 69, then
1. Return the List of Unicode code points p.
Assert: p is a Unicode property name or property alias listed in the “Property name and aliases” column of Table 67 or Table 68.
Let c be the canonical property name of p as given in the “Canonical property name” column of the corresponding row.
Return the List of Unicode code points c.

Implementations must support the Unicode property names and aliases listed in Table 67, Table 68, and Table 69. To ensure interoperability, implementations must not support any other property names or aliases.

Note 1

For example, Script_Extensions (property name) and scx (property alias) are valid, but script_extensions or Scx are not.

Note 2

The listed properties form a superset of what UTS18 RL1.2 requires.

Note 3

The spellings of entries in these tables (including casing) match the spellings used in the file PropertyAliases.txt in the Unicode Character Database. The precise spellings in that file are guaranteed to be stable.

Table 67: Non-binary Unicode property aliases and their canonical property names

Property name and aliases	Canonical property name
`General_Category`	`General_Category`
`gc`	`General_Category`
`Script`	`Script`
`sc`	`Script`
`Script_Extensions`	`Script_Extensions`
`scx`	`Script_Extensions`

Table 68: Binary Unicode property aliases and their canonical property names

Property name and aliases	Canonical property name
`ASCII`	`ASCII`
`ASCII_Hex_Digit`	`ASCII_Hex_Digit`
`AHex`	`ASCII_Hex_Digit`
`Alphabetic`	`Alphabetic`
`Alpha`	`Alphabetic`
`Any`	`Any`
`Assigned`	`Assigned`
`Bidi_Control`	`Bidi_Control`
`Bidi_C`	`Bidi_Control`
`Bidi_Mirrored`	`Bidi_Mirrored`
`Bidi_M`	`Bidi_Mirrored`
`Case_Ignorable`	`Case_Ignorable`
`CI`	`Case_Ignorable`
`Cased`	`Cased`
`Changes_When_Casefolded`	`Changes_When_Casefolded`
`CWCF`	`Changes_When_Casefolded`
`Changes_When_Casemapped`	`Changes_When_Casemapped`
`CWCM`	`Changes_When_Casemapped`
`Changes_When_Lowercased`	`Changes_When_Lowercased`
`CWL`	`Changes_When_Lowercased`
`Changes_When_NFKC_Casefolded`	`Changes_When_NFKC_Casefolded`
`CWKCF`	`Changes_When_NFKC_Casefolded`
`Changes_When_Titlecased`	`Changes_When_Titlecased`
`CWT`	`Changes_When_Titlecased`
`Changes_When_Uppercased`	`Changes_When_Uppercased`
`CWU`	`Changes_When_Uppercased`
`Dash`	`Dash`
`Default_Ignorable_Code_Point`	`Default_Ignorable_Code_Point`
`DI`	`Default_Ignorable_Code_Point`
`Deprecated`	`Deprecated`
`Dep`	`Deprecated`
`Diacritic`	`Diacritic`
`Dia`	`Diacritic`
`Emoji`	`Emoji`
`Emoji_Component`	`Emoji_Component`
`EComp`	`Emoji_Component`
`Emoji_Modifier`	`Emoji_Modifier`
`EMod`	`Emoji_Modifier`
`Emoji_Modifier_Base`	`Emoji_Modifier_Base`
`EBase`	`Emoji_Modifier_Base`
`Emoji_Presentation`	`Emoji_Presentation`
`EPres`	`Emoji_Presentation`
`Extended_Pictographic`	`Extended_Pictographic`
`ExtPict`	`Extended_Pictographic`
`Extender`	`Extender`
`Ext`	`Extender`
`Grapheme_Base`	`Grapheme_Base`
`Gr_Base`	`Grapheme_Base`
`Grapheme_Extend`	`Grapheme_Extend`
`Gr_Ext`	`Grapheme_Extend`
`Hex_Digit`	`Hex_Digit`
`Hex`	`Hex_Digit`
`IDS_Binary_Operator`	`IDS_Binary_Operator`
`IDSB`	`IDS_Binary_Operator`
`IDS_Trinary_Operator`	`IDS_Trinary_Operator`
`IDST`	`IDS_Trinary_Operator`
`ID_Continue`	`ID_Continue`
`IDC`	`ID_Continue`
`ID_Start`	`ID_Start`
`IDS`	`ID_Start`
`Ideographic`	`Ideographic`
`Ideo`	`Ideographic`
`Join_Control`	`Join_Control`
`Join_C`	`Join_Control`
`Logical_Order_Exception`	`Logical_Order_Exception`
`LOE`	`Logical_Order_Exception`
`Lowercase`	`Lowercase`
`Lower`	`Lowercase`
`Math`	`Math`
`Noncharacter_Code_Point`	`Noncharacter_Code_Point`
`NChar`	`Noncharacter_Code_Point`
`Pattern_Syntax`	`Pattern_Syntax`
`Pat_Syn`	`Pattern_Syntax`
`Pattern_White_Space`	`Pattern_White_Space`
`Pat_WS`	`Pattern_White_Space`
`Quotation_Mark`	`Quotation_Mark`
`QMark`	`Quotation_Mark`
`Radical`	`Radical`
`Regional_Indicator`	`Regional_Indicator`
`RI`	`Regional_Indicator`
`Sentence_Terminal`	`Sentence_Terminal`
`STerm`	`Sentence_Terminal`
`Soft_Dotted`	`Soft_Dotted`
`SD`	`Soft_Dotted`
`Terminal_Punctuation`	`Terminal_Punctuation`
`Term`	`Terminal_Punctuation`
`Unified_Ideograph`	`Unified_Ideograph`
`UIdeo`	`Unified_Ideograph`
`Uppercase`	`Uppercase`
`Upper`	`Uppercase`
`Variation_Selector`	`Variation_Selector`
`VS`	`Variation_Selector`
`White_Space`	`White_Space`
`space`	`White_Space`
`XID_Continue`	`XID_Continue`
`XIDC`	`XID_Continue`
`XID_Start`	`XID_Start`
`XIDS`	`XID_Start`

Table 69: Binary Unicode properties of strings

Property name
`Basic_Emoji`
`Emoji_Keycap_Sequence`
`RGI_Emoji_Modifier_Sequence`
`RGI_Emoji_Flag_Sequence`
`RGI_Emoji_Tag_Sequence`
`RGI_Emoji_ZWJ_Sequence`
`RGI_Emoji`

22.2.2.9.8 UnicodeMatchPropertyValue ( `p`, `v` )

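The abstract operation UnicodeMatchPropertyValue takes arguments p (ECMAScript source text) and v (ECMAScript source text) and returns a Unicode property value. It performs the following steps when called:

Assert: p is a canonical, unaliased Unicode property name listed in the “Canonical property name” column of Table 67.
Assert: v is a property value or property value alias for the Unicode property p listed in PropertyValueAliases.txt.
Let value be the canonical property value of v as given in the “Canonical property value” column of the corresponding row.
Return the List of Unicode code points value.

Implementations must support the Unicode property values and property value aliases listed in PropertyValueAliases.txt for the properties listed in Table 67. To ensure interoperability, implementations must not support any other property values or property value aliases.

Note 1

For example, Xpeo and Old_Persian are valid Script_Extensions values, but xpeo and Old Persian are not.

Note 2

This algorithm differs from the matching rules for symbolic values listed in UAX44: case, white space, U+002D (HYPHEN-MINUS), and U+005F (LOW LINE) are not ignored, and the Is prefix is not supported.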
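
The strict spelling rules for property names and values mean that loosely spelled variants are rejected when the pattern is parsed. The checks below restate the examples from the notes in runnable form; compiles is a hypothetical helper.

```javascript
// Hypothetical helper: does this pattern text parse with these flags?
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Exact spellings of names, aliases and values are accepted.
console.log(compiles("\\p{Script_Extensions=Old_Persian}", "u")); // true
console.log(compiles("\\p{scx=Xpeo}", "u"));                      // true

// Case, spaces and other loose variants are not ignored.
console.log(compiles("\\p{script_extensions=Old_Persian}", "u")); // false
console.log(compiles("\\p{Scx=Xpeo}", "u"));                      // false
console.log(compiles("\\p{scx=xpeo}", "u"));                      // false
console.log(compiles("\\p{scx=Old Persian}", "u"));               // false
```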

22.2.2.10 Runtime Semantics: CompileClassSetString

The syntax-directed operation CompileClassSetString takes argument rer (a RegExp Record) and returns a sequence of characters. It is defined piecewise over the following productions:

ClassString :: [empty]

Return an empty sequence of characters.

ClassString :: NonEmptyClassString

Return CompileClassSetString of NonEmptyClassString with argument rer.

NonEmptyClassString :: ClassSetCharacter NonEmptyClassString_opt

Let cs be CompileToCharSet of ClassSetCharacter with argument rer.
Let s1 be the sequence of characters that is the single CharSetElement of cs.
If NonEmptyClassString is present, then
1. Let s2 be CompileClassSetString of NonEmptyClassString with argument rer.
2. Return the concatenation of s1 and s2.
Return s1.

22.2.3 Abstract Operations for RegExp Creation

22.2.3.1 RegExpCreate ( `P`, `F` )

The abstract operation RegExpCreate takes arguments P (an ECMAScript language value) and F (a String or undefined) and returns either a normal completion containing an Object or a throw completion. It performs the following steps when called:

Let obj be ! RegExpAlloc(%RegExp%).
Return ? RegExpInitialize(obj, P, F).

22.2.3.2 RegExpAlloc ( `newTarget` )

The abstract operation RegExpAlloc takes argument newTarget (a constructor) and returns either a normal completion containing an Object or a throw completion. It performs the following steps when called:

Let obj be ? OrdinaryCreateFromConstructor(newTarget, "%RegExp.prototype%", « [[OriginalSource]], [[OriginalFlags]], [[RegExpRecord]], [[RegExpMatcher]] »).
Perform ! DefinePropertyOrThrow(obj, "lastIndex", PropertyDescriptor { [[Writable]]: true, [[Enumerable]]: false, [[Configurable]]: false }).
Return obj.

22.2.3.3 RegExpInitialize ( `obj`, `pattern`, `flags` )

The abstract operation RegExpInitialize takes arguments obj (an Object), pattern (an ECMAScript language value), and flags (an ECMAScript language value) and returns either a normal completion containing an Object or a throw completion. It performs the following steps when called:

If pattern is undefined, let P be the empty String.
Else, let P be ? ToString(pattern).
If flags is undefined, let F be the empty String.
Else, let F be ? ToString(flags).
If F contains any code unit other than "d", "g", "i", "m", "s", "u", "v", or "y", or if F contains any code unit more than once, throw a SyntaxError exception.
If F contains "i", let i be true; else let i be false.
If F contains "m", let m be true; else let m be false.
If F contains "s", let s be true; else let s be false.
If F contains "u", let u be true; else let u be false.
If F contains "v", let v be true; else let v be false.
If u is true or v is true, then
1. Let patternText be StringToCodePoints(P).
Else,
1. Let patternText be the result of interpreting each of P's 16-bit elements as a Unicode BMP code point. UTF-16 decoding is not applied to the elements.
Let parseResult be ParsePattern(patternText, u, v).
If parseResult is a non-empty List of SyntaxError objects, throw a SyntaxError exception.
Assert: parseResult is a Pattern Parse Node.
Set obj.[[OriginalSource]] to P.
Set obj.[[OriginalFlags]] to F.
Let capturingGroupsCount be CountLeftCapturingParensWithin(parseResult).
Let rer be the RegExp Record { [[IgnoreCase]]: i, [[Multiline]]: m, [[DotAll]]: s, [[Unicode]]: u, [[UnicodeSets]]: v, [[CapturingGroupsCount]]: capturingGroupsCount }.
Set obj.[[RegExpRecord]] to rer.
Set obj.[[RegExpMatcher]] to CompilePattern of parseResult with argument rer.
Perform ? Set(obj, "lastIndex", +0_𝔽, true).
Return obj.
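
Several of the initialization steps have directly observable results. The sketch below is illustrative only; compiles is a hypothetical helper, and the combination "uv" is rejected both by engines that implement the v flag and by older engines that treat v as an unknown flag.

```javascript
// Hypothetical helper: does this pattern text parse with these flags?
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Flag validation: unknown or repeated flags throw a SyntaxError.
console.log(compiles("a", "gim")); // true
console.log(compiles("a", "gg"));  // false
console.log(compiles("a", "x"));   // false
console.log(compiles("a", "uv"));  // false

// An undefined pattern becomes the empty String; lastIndex starts at 0.
console.log(new RegExp(undefined).source);  // "(?:)"
console.log(new RegExp("a", "g").lastIndex); // 0

// Without u or v, a surrogate pair is two separate pattern characters.
console.log(/^.$/.test("\u{1F600}"));  // false
console.log(/^.$/u.test("\u{1F600}")); // true
```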

22.2.3.4 Static Semantics: ParsePattern ( `patternText`, `u`, `v` )

The abstract operation ParsePattern takes arguments patternText (a sequence of Unicode code points), u (a Boolean), and v (a Boolean) and returns a Parse Node or a non-empty List of SyntaxError objects.

Note

This section is amended in B.1.2.9.

It performs the following steps when called:

If v is true and u is true, then
1. Let parseResult be a List containing one or more SyntaxError objects.
Else if v is true, then
1. Let parseResult be ParseText(patternText, Pattern[+UnicodeMode, +UnicodeSetsMode, +NamedCaptureGroups]).
Else if u is true, then
1. Let parseResult be ParseText(patternText, Pattern[+UnicodeMode, ~UnicodeSetsMode, +NamedCaptureGroups]).
Else,
1. Let parseResult be ParseText(patternText, Pattern[~UnicodeMode, ~UnicodeSetsMode, +NamedCaptureGroups]).
Return parseResult.

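22.2.4 The RegExp Constructor

The RegExp constructor:

is %RegExp%.
is the initial value of the "RegExp" property of the global object.
creates and initializes a new RegExp object when called as a constructor.
when called as a function rather than as a constructor, returns either a new RegExp object, or the argument itself if the only argument is a RegExp object.
may be used as the value of an extends clause of a class definition. Subclass constructors that intend to inherit the specified RegExp behaviour must include a super call to the RegExp constructor to create and initialize subclass instances with the necessary internal slots.

22.2.4.1 RegExp ( `pattern`, `flags` )

This function performs the following steps when called:

Let patternIsRegExp be ? IsRegExp(pattern).
If NewTarget is undefined, then
1. Let newTarget be the active function object.
2. If patternIsRegExp is true and flags is undefined, then
  1. Let patternConstructor be ? Get(pattern, "constructor").
  2. If SameValue(newTarget, patternConstructor) is true, return pattern.
Else,
1. Let newTarget be NewTarget.
If pattern is an Object and pattern has a [[RegExpMatcher]] internal slot, then
1. Let P be pattern.[[OriginalSource]].
2. If flags is undefined, let F be pattern.[[OriginalFlags]]; else let F be flags.
Else if patternIsRegExp is true, then
1. Let P be ? Get(pattern, "source").
2. If flags is undefined, then
  1. Let F be ? Get(pattern, "flags").
3. Else,
  1. Let F be flags.
Else,
1. Let P be pattern.
2. Let F be flags.
Let O be ? RegExpAlloc(newTarget).
Return ? RegExpInitialize(O, P, F).

Note

If pattern is supplied using a StringLiteral, the usual escape sequence substitutions are performed before the String is processed by this function. If pattern must contain an escape sequence to be recognized by this function, any U+005C (REVERSE SOLIDUS) code points must be escaped within the StringLiteral to prevent them being removed when the contents of the StringLiteral are formed.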
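
As an illustration of the call-versus-construct distinction and of the escaping caveat in the note, here is a small sketch. It assumes an unmodified RegExp constructor; the variable names are chosen for this example only.

```javascript
const original = /ab+c/g;

// Called as a function with a RegExp and no flags: the argument itself is returned.
const sameObject = RegExp(original) === original;

// Called as a constructor: a new object is allocated from [[OriginalSource]] and [[OriginalFlags]].
const copy = new RegExp(original);

// Explicit flags replace the flags of the source RegExp.
const reflagged = new RegExp(original, "i");

// Inside a StringLiteral the backslash must be doubled to reach the pattern.
const digits = new RegExp("\\d+");
// With a single backslash, "\d" in the string literal is just the letter "d".
const justD = new RegExp("\d+");

console.log(sameObject);                        // true
console.log(copy !== original, copy.flags);     // true "g"
console.log(reflagged.flags);                   // "i"
console.log(digits.test("42"), justD.test("42")); // true false
```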

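22.2.5 Properties of the RegExp Constructor

The RegExp constructor:

has a [[Prototype]] internal slot whose value is %Function.prototype%.
has the following properties:

22.2.5.1 RegExp.escape ( `S` )

This function returns a copy of S in which characters that are potentially special in a regular expression Pattern have been replaced by equivalent escape sequences.

It performs the following steps when called:

If S is not a String, throw a TypeError exception.
Let escaped be the empty String.
Let cpList be StringToCodePoints(S).
For each code point cp of cpList, do
1. If escaped is the empty String and cp is matched by either DecimalDigit or AsciiLetter, then
  1. NOTE: Escaping a leading digit ensures that the output can be used after a \0 character escape or a DecimalEscape such as \1 and still match S rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after \c.
  2. Let numericValue be the numeric value of cp.
  3. Let hex be Number::toString(𝔽(numericValue), 16).
  4. Assert: The length of hex is 2.
  5. Set escaped to the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and hex.
2. Else,
  1. Set escaped to the string-concatenation of escaped and EncodeForRegExpEscape(cp).
Return escaped.

Note

Despite having similar names, EscapeRegExpPattern and RegExp.escape do not perform similar actions. The former escapes a pattern for representation as a string, while this function escapes a string for representation inside a pattern.

22.2.5.1.1 EncodeForRegExpEscape ( `cp` )

The abstract operation EncodeForRegExpEscape takes argument cp (a code point) and returns a String. It returns a String representing a Pattern for matching cp. If cp is white space or an ASCII punctuator, the returned value is an escape sequence. Otherwise, the returned value is a String representation of cp itself. It performs the following steps when called:

If cp is matched by SyntaxCharacter or cp is U+002F (SOLIDUS), then
1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(cp).
Else if cp is a code point listed in the “Code Point” column of Table 65, then
1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains cp.
Let otherPunctuators be the string-concatenation of ",-=<>#&!%:;@~'`" and the code unit 0x0022 (QUOTATION MARK).
Let toEscape be StringToCodePoints(otherPunctuators).
If toEscape contains cp, cp is matched by either WhiteSpace or LineTerminator, or cp has the same numeric value as a leading surrogate or trailing surrogate, then
1. Let cpNum be the numeric value of cp.
2. If cpNum ≤ 0xFF, then
  1. Let hex be Number::toString(𝔽(cpNum), 16).
  2. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and StringPad(hex, 2, "0", start).
3. Let escaped be the empty String.
4. Let codeUnits be UTF16EncodeCodePoint(cp).
5. For each code unit cu of codeUnits, do
  1. Set escaped to the string-concatenation of escaped and UnicodeEscape(cu).
6. Return escaped.
Return UTF16EncodeCodePoint(cp).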
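
The two algorithms above can be approximated in user code. The following is a sketch only, not the built-in: escapeForRegExp and encodeForRegExpEscape are hypothetical names, the ControlEscape table is assumed to contain the code points for \t, \n, \v, \f, and \r, and the \s character class is used as a stand-in for the WhiteSpace and LineTerminator productions.

```javascript
const SYNTAX_CHARACTERS = "^$\\.*+?()[]{}|";
// Assumed content of the ControlEscape table: code point -> escape letter.
const CONTROL_ESCAPES = new Map([["\t", "t"], ["\n", "n"], ["\v", "v"], ["\f", "f"], ["\r", "r"]]);
const OTHER_PUNCTUATORS = ",-=<>#&!%:;@~'`\"";

// Sketch of EncodeForRegExpEscape; ch is a string holding exactly one code point.
function encodeForRegExpEscape(ch) {
  const cp = ch.codePointAt(0);
  if (SYNTAX_CHARACTERS.includes(ch) || ch === "/") return "\\" + ch;
  if (CONTROL_ESCAPES.has(ch)) return "\\" + CONTROL_ESCAPES.get(ch);
  const isSurrogate = cp >= 0xd800 && cp <= 0xdfff;
  if (OTHER_PUNCTUATORS.includes(ch) || /^\s$/.test(ch) || isSurrogate) {
    if (cp <= 0xff) return "\\x" + cp.toString(16).padStart(2, "0");
    let escaped = "";
    for (let i = 0; i < ch.length; i++) {
      // One \uXXXX escape per UTF-16 code unit.
      escaped += "\\u" + ch.charCodeAt(i).toString(16).padStart(4, "0");
    }
    return escaped;
  }
  return ch;
}

// Sketch of RegExp.escape.
function escapeForRegExp(s) {
  if (typeof s !== "string") throw new TypeError("expected a string");
  let escaped = "";
  // String iteration yields code points, with lone surrogates yielded individually.
  for (const ch of s) {
    if (escaped === "" && /^[0-9A-Za-z]$/.test(ch)) {
      escaped = "\\x" + ch.codePointAt(0).toString(16); // always two hex digits here
    } else {
      escaped += encodeForRegExpEscape(ch);
    }
  }
  return escaped;
}

console.log(escapeForRegExp("1+1=2")); // \x31\+1\x3d2
```

A string escaped this way can be embedded in a larger pattern, for example new RegExp("^" + escapeForRegExp(input) + "$"), and should match only the literal input.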

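22.2.5.2 RegExp.prototype

The initial value of RegExp.prototype is the RegExp prototype object.

This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.

22.2.5.3 get RegExp [ %Symbol.species% ]

RegExp[%Symbol.species%] is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Return the this value.

The value of the "name" property of this function is "get [Symbol.species]".

Note

RegExp prototype methods normally use their this value's constructor to create a derived object. However, a subclass constructor may override that default behaviour by redefining its %Symbol.species% property.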

22.2.6 Properties of the RegExp Prototype Object

The RegExp prototype object:

is %RegExp.prototype%.
is an ordinary object.
is not a RegExp instance and does not have a [[RegExpMatcher]] internal slot or any of the other internal slots of RegExp instance objects.
has a [[Prototype]] internal slot whose value is %Object.prototype%.

Note

The RegExp prototype object does not have a "valueOf" property of its own; however, it inherits the "valueOf" property from the Object prototype object.

22.2.6.1 RegExp.prototype.constructor

The initial value of RegExp.prototype.constructor is %RegExp%.

22.2.6.2 RegExp.prototype.exec ( `string` )

This method searches string for an occurrence of the regular expression pattern and returns an Array containing the results of the match, or null if string did not match.

It performs the following steps when called:

Let R be the this value.
Perform ? RequireInternalSlot(R, [[RegExpMatcher]]).
Let S be ? ToString(string).
Return ? RegExpBuiltinExec(R, S).

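22.2.6.3 get RegExp.prototype.dotAll

RegExp.prototype.dotAll is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0073 (LATIN SMALL LETTER S).
Return ? RegExpHasFlag(R, cu).

22.2.6.4 get RegExp.prototype.flags

RegExp.prototype.flags is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
If R is not an Object, throw a TypeError exception.
Let codeUnits be a new empty List.
Let hasIndices be ToBoolean(? Get(R, "hasIndices")).
If hasIndices is true, append the code unit 0x0064 (LATIN SMALL LETTER D) to codeUnits.
Let global be ToBoolean(? Get(R, "global")).
If global is true, append the code unit 0x0067 (LATIN SMALL LETTER G) to codeUnits.
Let ignoreCase be ToBoolean(? Get(R, "ignoreCase")).
If ignoreCase is true, append the code unit 0x0069 (LATIN SMALL LETTER I) to codeUnits.
Let multiline be ToBoolean(? Get(R, "multiline")).
If multiline is true, append the code unit 0x006D (LATIN SMALL LETTER M) to codeUnits.
Let dotAll be ToBoolean(? Get(R, "dotAll")).
If dotAll is true, append the code unit 0x0073 (LATIN SMALL LETTER S) to codeUnits.
Let unicode be ToBoolean(? Get(R, "unicode")).
If unicode is true, append the code unit 0x0075 (LATIN SMALL LETTER U) to codeUnits.
Let unicodeSets be ToBoolean(? Get(R, "unicodeSets")).
If unicodeSets is true, append the code unit 0x0076 (LATIN SMALL LETTER V) to codeUnits.
Let sticky be ToBoolean(? Get(R, "sticky")).
If sticky is true, append the code unit 0x0079 (LATIN SMALL LETTER Y) to codeUnits.
Return the String value whose code units are the elements of the List codeUnits. If codeUnits has no elements, the empty String is returned.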
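
Because the flags getter only reads properties of its this value, it works on any object and always emits flags in the fixed order d, g, i, m, s, u, v, y. A short sketch, assuming a conforming host (flagsGetter is a local name used for this example):

```javascript
const flagsGetter = Object.getOwnPropertyDescriptor(RegExp.prototype, "flags").get;

// Flags supplied in any order are reported in canonical order.
const canonical = new RegExp("a", "yig").flags;

// The getter is generic: each property is read and coerced with ToBoolean.
const fromPlainObject = flagsGetter.call({ global: true, sticky: 1, ignoreCase: 0 });

// A non-Object this value is rejected.
let nonObjectError = null;
try {
  flagsGetter.call(1);
} catch (err) {
  nonObjectError = err;
}

console.log(canonical);       // "giy"
console.log(fromPlainObject); // "gy"
console.log(nonObjectError instanceof TypeError); // true
```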

22.2.6.4.1 RegExpHasFlag ( `R`, `codeUnit` )

The abstract operation RegExpHasFlag takes arguments R (an ECMAScript language value) and codeUnit (a code unit) and returns either a normal completion containing either a Boolean or undefined, or a throw completion. It performs the following steps when called:

If R is not an Object, throw a TypeError exception.
If R does not have an [[OriginalFlags]] internal slot, then
1. If SameValue(R, %RegExp.prototype%) is true, return undefined.
2. Otherwise, throw a TypeError exception.
Let flags be R.[[OriginalFlags]].
If flags contains codeUnit, return true.
Return false.
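
The three outcomes of RegExpHasFlag (a Boolean, undefined, or a TypeError) can be observed through any of the single-flag accessors. This sketch uses the dotAll getter and assumes a conforming host; dotAllGetter is a local name for this example.

```javascript
const dotAllGetter = Object.getOwnPropertyDescriptor(RegExp.prototype, "dotAll").get;

// A RegExp instance has [[OriginalFlags]], so the result is a Boolean.
const onInstance = dotAllGetter.call(new RegExp(".", "s"));
const onInstanceWithoutFlag = dotAllGetter.call(new RegExp("."));

// %RegExp.prototype% has no [[OriginalFlags]] slot but is special-cased to undefined.
const onPrototype = dotAllGetter.call(RegExp.prototype);

// Any other object without the slot is rejected.
let plainObjectError = null;
try {
  dotAllGetter.call({});
} catch (err) {
  plainObjectError = err;
}

console.log(onInstance, onInstanceWithoutFlag, onPrototype); // true false undefined
console.log(plainObjectError instanceof TypeError);          // true
```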

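22.2.6.5 get RegExp.prototype.global

RegExp.prototype.global is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0067 (LATIN SMALL LETTER G).
Return ? RegExpHasFlag(R, cu).

22.2.6.6 get RegExp.prototype.hasIndices

RegExp.prototype.hasIndices is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0064 (LATIN SMALL LETTER D).
Return ? RegExpHasFlag(R, cu).

22.2.6.7 get RegExp.prototype.ignoreCase

RegExp.prototype.ignoreCase is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0069 (LATIN SMALL LETTER I).
Return ? RegExpHasFlag(R, cu).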

22.2.6.8 RegExp.prototype [ %Symbol.match% ] ( `string` )

This method performs the following steps when called:

Let rx be the this value.
If rx is not an Object, throw a TypeError exception.
Let S be ? ToString(string).
Let flags be ? ToString(? Get(rx, "flags")).
If flags does not contain "g", then
1. Return ? RegExpExec(rx, S).
Else,
1. If flags contains "u" or flags contains "v", let fullUnicode be true; else let fullUnicode be false.
2. Perform ? Set(rx, "lastIndex", +0_𝔽, true).
3. Let A be ! ArrayCreate(0).
4. Let n be 0.
5. Repeat,
  1. Let result be ? RegExpExec(rx, S).
  2. If result is null, then
    1. If n = 0, return null.
    2. Return A.
  3. Else,
    1. Let matchStr be ? ToString(? Get(result, "0")).
    2. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(n)), matchStr).
    3. If matchStr is the empty String, then
      1. Let thisIndex be ℝ(? ToLength(? Get(rx, "lastIndex"))).
      2. Let nextIndex be AdvanceStringIndex(S, thisIndex, fullUnicode).
      3. Perform ? Set(rx, "lastIndex", 𝔽(nextIndex), true).
    4. Set n to n + 1.

The value of the "name" property of this method is "[Symbol.match]".

Note

The %Symbol.match% property is used by the IsRegExp abstract operation to identify objects that have the basic behaviour of regular expressions. The absence of a %Symbol.match% property, or the existence of such a property whose value does not Boolean coerce to true, indicates that the object is not intended to be used as a regular expression object.
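
The following sketch exercises both branches of the algorithm through String.prototype.match, which delegates to this method, and then illustrates the IsRegExp check described in the note. It assumes a conforming host; all variable names are local to the example.

```javascript
// Without "g": the result is whatever a single RegExpExec call returns.
const single = "a1b22".match(/\d+/);

// With "g": only the matched strings are collected.
const all = "a1b22".match(/\d+/g);

// Empty matches advance lastIndex by one position, so the loop terminates.
const emptyMatches = "abc".match(/x*/g);

// lastIndex is reset to 0 before a global match begins.
const re = /\d/g;
re.lastIndex = 3;
const fromStart = "1a2".match(re);

// No match at all in global mode yields null rather than an empty array.
const none = "abc".match(/\d/g);

// A falsy Symbol.match property makes IsRegExp return false,
// so startsWith accepts the object and converts it to the string "/a/".
const notTreatedAsRegExp = /a/;
notTreatedAsRegExp[Symbol.match] = false;
const accepted = "/a/".startsWith(notTreatedAsRegExp);

console.log(single.index, all, emptyMatches.length, fromStart, none, accepted);
```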

22.2.6.9 RegExp.prototype [ %Symbol.matchAll% ] ( `string` )

This method performs the following steps when called:

Let R be the this value.
If R is not an Object, throw a TypeError exception.
Let S be ? ToString(string).
Let C be ? SpeciesConstructor(R, %RegExp%).
Let flags be ? ToString(? Get(R, "flags")).
Let matcher be ? Construct(C, « R, flags »).
Let lastIndex be ? ToLength(? Get(R, "lastIndex")).
Perform ? Set(matcher, "lastIndex", lastIndex, true).
If flags contains "g", let global be true; else let global be false.
If flags contains "u" or flags contains "v", let fullUnicode be true; else let fullUnicode be false.
Return CreateRegExpStringIterator(matcher, S, global, fullUnicode).

The value of the "name" property of this method is "[Symbol.matchAll]".
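
Two consequences of these steps are easy to miss: the iteration runs on a copy of the RegExp that starts at the original's lastIndex, and the method itself accepts a non-global RegExp. A sketch, assuming a conforming host:

```javascript
const re = /\d/g;
re.lastIndex = 2;

// The matcher is a fresh object that inherits lastIndex = 2,
// so the "1" at index 1 is skipped.
const found = [...re[Symbol.matchAll]("a1b2c3")].map((m) => m[0]);

// The original RegExp is not advanced by the iteration.
const lastIndexAfter = re.lastIndex;

// Called directly, the method also works without "g" and yields at most one match.
const nonGlobalRe = /\d/;
const nonGlobal = [...nonGlobalRe[Symbol.matchAll]("a1b2")].map((m) => m[0]);

console.log(found, lastIndexAfter, nonGlobal);
```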

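22.2.6.10 get RegExp.prototype.multiline

RegExp.prototype.multiline is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x006D (LATIN SMALL LETTER M).
Return ? RegExpHasFlag(R, cu).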

22.2.6.11 RegExp.prototype [ %Symbol.replace% ] ( `string`, `replaceValue` )

This method performs the following steps when called:

Let rx be the this value.
If rx is not an Object, throw a TypeError exception.
Let S be ? ToString(string).
Let lengthS be the length of S.
Let functionalReplace be IsCallable(replaceValue).
If functionalReplace is false, set replaceValue to ? ToString(replaceValue).
Let flags be ? ToString(? Get(rx, "flags")).
If flags contains "g", let global be true; else let global be false.
If global is true, perform ? Set(rx, "lastIndex", +0_𝔽, true).
Let results be a new empty List.
Let done be false.
Repeat, while done is false,
1. Let result be ? RegExpExec(rx, S).
2. If result is null, set done to true.
3. Else,
  1. Append result to results.
  2. If global is false, set done to true.
  3. Else,
    1. Let matchStr be ? ToString(? Get(result, "0")).
    2. If matchStr is the empty String, then
      1. Let thisIndex be ℝ(? ToLength(? Get(rx, "lastIndex"))).
      2. If flags contains "u" or flags contains "v", let fullUnicode be true; else let fullUnicode be false.
      3. Let nextIndex be AdvanceStringIndex(S, thisIndex, fullUnicode).
      4. Perform ? Set(rx, "lastIndex", 𝔽(nextIndex), true).
Let accumulatedResult be the empty String.
Let nextSourcePosition be 0.
For each element result of results, do
1. Let resultLength be ? LengthOfArrayLike(result).
2. Let nCaptures be max(resultLength - 1, 0).
3. Let matched be ? ToString(? Get(result, "0")).
4. Let matchLength be the length of matched.
5. Let position be ? ToIntegerOrInfinity(? Get(result, "index")).
6. Set position to the result of clamping position between 0 and lengthS.
7. Let captures be a new empty List.
8. Let n be 1.
9. Repeat, while n ≤ nCaptures,
  1. Let capN be ? Get(result, ! ToString(𝔽(n))).
  2. If capN is not undefined, set capN to ? ToString(capN).
  3. Append capN to captures.
  4. NOTE: When n = 1, the preceding step puts the first capture into captures at index 0. More generally, the nth capture is at index n - 1.
  5. Set n to n + 1.
10. Let namedCaptures be ? Get(result, "groups").
11. If functionalReplace is true, then
  1. Let replacerArgs be the list-concatenation of « matched », captures, and « 𝔽(position), S ».
  2. If namedCaptures is not undefined, append namedCaptures to replacerArgs.
  3. Let replacementValue be ? Call(replaceValue, undefined, replacerArgs).
  4. Let replacementString be ? ToString(replacementValue).
12. Else,
  1. If namedCaptures is not undefined, set namedCaptures to ? ToObject(namedCaptures).
  2. Let replacementString be ? GetSubstitution(matched, S, position, captures, namedCaptures, replaceValue).
13. If position ≥ nextSourcePosition, then
  1. NOTE: position should not normally move backwards. If it does, it indicates an ill-behaving RegExp subclass or a side effect that changed the characteristics of rx. In such cases, the corresponding substitution is ignored.
  2. Set accumulatedResult to the string-concatenation of accumulatedResult, the substring of S from nextSourcePosition to position, and replacementString.
  3. Set nextSourcePosition to position + matchLength.
If nextSourcePosition ≥ lengthS, return accumulatedResult.
Return the string-concatenation of accumulatedResult and the substring of S from nextSourcePosition.

The value of the "name" property of this method is "[Symbol.replace]".
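
The shape of replacerArgs and the handling of named captures can be seen through String.prototype.replace, which delegates to this method. This is a sketch assuming a conforming host; the variable names are local to the example.

```javascript
// String replacement: named captures are substituted through GetSubstitution.
const reordered = "2024-05-17".replace(
  /(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/,
  "$<d>/$<m>/$<y>"
);

// Function replacement: arguments are (matched, ...captures, position, S).
const seen = [];
const masked = "a1b2".replace(/(\d)/g, (...args) => {
  seen.push(args);
  return "#";
});

// When the pattern has named groups, the groups object is appended as the last argument.
let groupsArg;
"x=1".replace(/(?<key>\w)=(?<val>\d)/, (...args) => {
  groupsArg = args[args.length - 1];
  return "";
});

console.log(reordered);             // "17/05/2024"
console.log(masked, seen[1]);       // "a#b#" [ "2", "2", 3, "a1b2" ]
console.log(groupsArg.key, groupsArg.val); // "x" "1"
```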

22.2.6.12 RegExp.prototype [ %Symbol.search% ] ( `string` )

This method performs the following steps when called:

Let rx be the this value.
If rx is not an Object, throw a TypeError exception.
Let S be ? ToString(string).
Let previousLastIndex be ? Get(rx, "lastIndex").
If previousLastIndex is not +0_𝔽, then
1. Perform ? Set(rx, "lastIndex", +0_𝔽, true).
Let result be ? RegExpExec(rx, S).
Let currentLastIndex be ? Get(rx, "lastIndex").
If SameValue(currentLastIndex, previousLastIndex) is false, then
1. Perform ? Set(rx, "lastIndex", previousLastIndex, true).
If result is null, return -1_𝔽.
Return ? Get(result, "index").

The value of the "name" property of this method is "[Symbol.search]".

Note

The "lastIndex" and "global" properties of this RegExp object are ignored when performing the search. The "lastIndex" property is left unchanged.
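
A short sketch of the save-and-restore behaviour described in the note, assuming a conforming host. The sticky case is included because the note covers only "lastIndex" and "global"; the "y" flag still constrains the match to index 0.

```javascript
const re = /b/g;
re.lastIndex = 5;

// The search always starts from index 0, whatever lastIndex was.
const position = "abc".search(re);

// lastIndex is restored to its previous value afterwards.
const restored = re.lastIndex;

// A sticky RegExp only matches at index 0 during a search.
const stickyPosition = "abc".search(/b/y);

console.log(position, restored, stickyPosition); // 1 5 -1
```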

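22.2.6.13 get RegExp.prototype.source

RegExp.prototype.source is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
If R is not an Object, throw a TypeError exception.
If R does not have an [[OriginalSource]] internal slot, then
1. If SameValue(R, %RegExp.prototype%) is true, return "(?:)".
2. Otherwise, throw a TypeError exception.
Assert: R has an [[OriginalFlags]] internal slot.
Let src be R.[[OriginalSource]].
Let flags be R.[[OriginalFlags]].
Return EscapeRegExpPattern(src, flags).

22.2.6.13.1 EscapeRegExpPattern ( `P`, `F` )

The abstract operation EscapeRegExpPattern takes arguments P (a String) and F (a String) and returns a String. It performs the following steps when called:

If F contains "v", let patternSymbol be Pattern[+UnicodeMode, +UnicodeSetsMode].
Else if F contains "u", let patternSymbol be Pattern[+UnicodeMode, ~UnicodeSetsMode].
Else, let patternSymbol be Pattern[~UnicodeMode, ~UnicodeSetsMode].
Let S be a String in the form of a patternSymbol equivalent to P interpreted as UTF-16 encoded Unicode code points (6.1.4), in which certain code points are escaped as described below. S may or may not differ from P; however, the Abstract Closure that would result from evaluating S as a patternSymbol must behave identically to the Abstract Closure given by the constructed object's [[RegExpMatcher]] internal slot. Multiple calls to this abstract operation using the same values for P and F must produce identical results.
The code points / or any LineTerminator occurring in the pattern shall be escaped in S as necessary to ensure that the string-concatenation of "/", S, "/", and F can be parsed as a RegularExpressionLiteral that behaves identically to the constructed regular expression. For example, if P is "/", then S could be "\/" or "\u002F", among other possibilities, but not "/", because /// followed by F would be parsed as a SingleLineComment rather than a RegularExpressionLiteral. If P is the empty String, this specification can be met by letting S be "(?:)".
Return S.

Note

Despite having similar names, RegExp.escape and EscapeRegExpPattern do not perform similar actions. The former escapes a string for representation inside a pattern, while this function escapes a pattern for representation as a string.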
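
Since the exact text of S is left to the implementation, a portable check can only assert the guarantees stated above rather than a specific escape. The sketch below does that, assuming a conforming host; the variable names are local to the example.

```javascript
const slash = new RegExp("/");
const newline = new RegExp("\n");

// The source must not contain an unescaped "/" or a raw line terminator,
// but whether it is "\/" or "\u002F" (and so on) is implementation-defined.
const slashSource = slash.source;
const newlineSource = newline.source;

// Rebuilding a RegExp from source and flags must preserve behaviour.
const rebuiltSlash = new RegExp(slash.source, slash.flags);
const rebuiltNewline = new RegExp(newline.source, newline.flags);

// The prototype object itself reports "(?:)".
const prototypeSource = RegExp.prototype.source;

console.log(slashSource !== "/", newlineSource.includes("\n"));
console.log(rebuiltSlash.test("a/b"), rebuiltNewline.test("a\nb"), prototypeSource);
```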

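22.2.6.14 RegExp.prototype [ %Symbol.split% ] ( `string`, `limit` )

Note 1

This method returns an Array into which substrings of the result of converting string to a String have been stored. The substrings are determined by searching from left to right for matches of the this value regular expression; these occurrences are not part of any String in the returned array, but serve to divide up the String value.

The this value may be an empty regular expression or a regular expression that can match an empty String. In this case, the regular expression does not match the empty substring at the beginning or end of the input String, nor does it match the empty substring at the end of the previous separator match. (For example, if the regular expression matches the empty String, the String is split up into individual code unit elements; the length of the result array equals the length of the String, and each substring contains one code unit.) Only the first match at a given index of the String is considered, even if backtracking could yield a non-empty substring match at that index. (For example, /a*?/[Symbol.split]("ab") evaluates to the array ["a", "b"], while /a*/[Symbol.split]("ab") evaluates to the array ["","b"].)

If string is (or converts to) the empty String, the result depends on whether the regular expression can match the empty String. If it can, the result array contains no elements. Otherwise, the result array contains one element, which is the empty String.

If the regular expression contains capturing parentheses, then each time the separator is matched the results (including any undefined results) of the capturing parentheses are spliced into the output array. For example,

/<(\/)?([^<>]+)>/[Symbol.split]("A<B>bold</B>and<CODE>coded</CODE>")

evaluates to the array

["A", undefined, "B", "bold", "/", "B", "and", undefined, "CODE", "coded", "/", "CODE", ""]

If limit is not undefined, then the output array is truncated so that it contains no more than limit elements.

This method performs the following steps when called:

Let rx be the this value.
If rx is not an Object, throw a TypeError exception.
Let S be ? ToString(string).
Let C be ? SpeciesConstructor(rx, %RegExp%).
Let flags be ? ToString(? Get(rx, "flags")).
If flags contains "u" or flags contains "v", let unicodeMatching be true; else let unicodeMatching be false.
If flags contains "y", let newFlags be flags; else let newFlags be the string-concatenation of flags and "y".
Let splitter be ? Construct(C, « rx, newFlags »).
Let A be ! ArrayCreate(0).
Let lengthA be 0.
If limit is undefined, let lim be 2³² - 1; else let lim be ℝ(? ToUint32(limit)).
If lim = 0, return A.
If S is the empty String, then
1. Let z be ? RegExpExec(splitter, S).
2. If z is not null, return A.
3. Perform ! CreateDataPropertyOrThrow(A, "0", S).
4. Return A.
Let size be the length of S.
Let p be 0.
Let q be p.
Repeat, while q < size,
1. Perform ? Set(splitter, "lastIndex", 𝔽(q), true).
2. Let z be ? RegExpExec(splitter, S).
3. If z is null, then
  1. Set q to AdvanceStringIndex(S, q, unicodeMatching).
4. Else,
  1. Let e be ℝ(? ToLength(? Get(splitter, "lastIndex"))).
  2. Set e to min(e, size).
  3. If e = p, then
    1. Set q to AdvanceStringIndex(S, q, unicodeMatching).
  4. Else,
    1. Let T be the substring of S from p to q.
    2. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), T).
    3. Set lengthA to lengthA + 1.
    4. If lengthA = lim, return A.
    5. Set p to e.
    6. Let numberOfCaptures be ? LengthOfArrayLike(z).
    7. Set numberOfCaptures to max(numberOfCaptures - 1, 0).
    8. Let i be 1.
    9. Repeat, while i ≤ numberOfCaptures,
      1. Let nextCapture be ? Get(z, ! ToString(𝔽(i))).
      2. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), nextCapture).
      3. Set i to i + 1.
      4. Set lengthA to lengthA + 1.
      5. If lengthA = lim, return A.
    10. Set q to p.
Let T be the substring of S from p to size.
Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), T).
Return A.

The value of the "name" property of this method is "[Symbol.split]".

Note 2

This method ignores the value of the "global" and "sticky" properties of this RegExp object.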
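
The cases described in Note 1 can be reproduced directly. The sketch below collects them in one place, assuming a conforming host; the variable names are local to the example.

```javascript
// Only the first match at each index is considered.
const lazy = /a*?/[Symbol.split]("ab");
const greedy = /a*/[Symbol.split]("ab");

// Captures, including undefined ones, are spliced into the output.
const tags = /<(\/)?([^<>]+)>/[Symbol.split]("A<B>bold</B>and<CODE>coded</CODE>");

// limit truncates the output array.
const limited = "a,b,c".split(/,/, 2);

// Empty input: the result depends on whether the pattern can match the empty String.
const emptyInputMatch = "".split(/x*/);
const emptyInputNoMatch = "".split(/x/);

console.log(lazy, greedy);                       // [ "a", "b" ] [ "", "b" ]
console.log(tags.length, tags[1], tags[4]);      // 13 undefined "/"
console.log(limited);                            // [ "a", "b" ]
console.log(emptyInputMatch, emptyInputNoMatch); // [] [ "" ]
```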

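22.2.6.15 get RegExp.prototype.sticky

RegExp.prototype.sticky is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0079 (LATIN SMALL LETTER Y).
Return ? RegExpHasFlag(R, cu).

22.2.6.16 RegExp.prototype.test ( `S` )

This method performs the following steps when called:

Let R be the this value.
If R is not an Object, throw a TypeError exception.
Let string be ? ToString(S).
Let match be ? RegExpExec(R, string).
If match is not null, return true; else return false.

22.2.6.17 RegExp.prototype.toString ( )

This method performs the following steps when called:

Let R be the this value.
If R is not an Object, throw a TypeError exception.
Let pattern be ? ToString(? Get(R, "source")).
Let flags be ? ToString(? Get(R, "flags")).
Let result be the string-concatenation of "/", pattern, "/", and flags.
Return result.

Note

The returned String has the form of a RegularExpressionLiteral that evaluates to another RegExp object with the same behaviour as this object.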

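22.2.6.18 get RegExp.prototype.unicode

RegExp.prototype.unicode is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0075 (LATIN SMALL LETTER U).
Return ? RegExpHasFlag(R, cu).

22.2.6.19 get RegExp.prototype.unicodeSets

RegExp.prototype.unicodeSets is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps when called:

Let R be the this value.
Let cu be the code unit 0x0076 (LATIN SMALL LETTER V).
Return ? RegExpHasFlag(R, cu).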

22.2.7 Abstract Operations for RegExp Matching

22.2.7.1 RegExpExec ( `R`, `S` )

The abstract operation RegExpExec takes arguments R (an Object) and S (a String) and returns either a normal completion containing either an Object or null, or a throw completion. It performs the following steps when called:

Let exec be ? Get(R, "exec").
If IsCallable(exec) is true, then
1. Let result be ? Call(exec, R, « S »).
2. If result is not an Object and result is not null, throw a TypeError exception.
3. Return result.
Perform ? RequireInternalSlot(R, [[RegExpMatcher]]).
Return ? RegExpBuiltinExec(R, S).

Note

If a callable "exec" property is not found, this algorithm falls back to the built-in RegExp matching algorithm. This provides compatible behaviour for code written for prior editions, where most built-in algorithms that use regular expressions did not perform a dynamic property lookup of "exec".
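
The dynamic lookup of "exec" is observable through RegExp.prototype.test, which calls RegExpExec. A sketch, assuming a conforming host; the replacement exec functions are deliberately artificial.

```javascript
const re = /a/;

// A callable own "exec" is used instead of the built-in matcher.
// Any Object result counts as a successful match.
re.exec = () => ({ 0: "x", index: 0 });
const viaCustomExec = re.test("zzz");

// A result that is neither an Object nor null is rejected.
re.exec = () => 42;
let badResultError = null;
try {
  re.test("a");
} catch (err) {
  badResultError = err;
}

// A non-callable "exec" falls back to the built-in algorithm.
re.exec = undefined;
const viaBuiltin = re.test("a");

console.log(viaCustomExec, badResultError instanceof TypeError, viaBuiltin); // true true true
```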

22.2.7.2 RegExpBuiltinExec ( `R`, `S` )

The abstract operation RegExpBuiltinExec takes arguments R (an initialized RegExp instance) and S (a String) and returns either a normal completion containing either an Array exotic object or null, or a throw completion. It performs the following steps when called:

Let length be the length of S.
Let lastIndex be ℝ(? ToLength(! Get(R, "lastIndex"))).
Let flags be R.[[OriginalFlags]].
If flags contains "g", let global be true; else let global be false.
If flags contains "y", let sticky be true; else let sticky be false.
If flags contains "d", let hasIndices be true; else let hasIndices be false.
If global is false and sticky is false, set lastIndex to 0.
Let matcher be R.[[RegExpMatcher]].
If flags contains "u" or flags contains "v", let fullUnicode be true; else let fullUnicode be false.
Let matchSucceeded be false.
If fullUnicode is true, let input be StringToCodePoints(S); otherwise, let input be a List whose elements are the code units of S.
NOTE: Each element of input is considered to be a character.
Repeat, while matchSucceeded is false,
1. If lastIndex > length, then
  1. If global is true or sticky is true, then
    1. Perform ? Set(R, "lastIndex", +0_𝔽, true).
  2. Return null.
2. Let inputIndex be the index into input of the character that was obtained from element lastIndex of S.
3. Let r be matcher(input, inputIndex).
4. If r is failure, then
  1. If sticky is true, then
    1. Perform ? Set(R, "lastIndex", +0_𝔽, true).
    2. Return null.
  2. Set lastIndex to AdvanceStringIndex(S, lastIndex, fullUnicode).
5. Else,
  1. Assert: r is a MatchState.
  2. Set matchSucceeded to true.
Let e be r.[[EndIndex]].
If fullUnicode is true, set e to GetStringIndex(S, e).
If global is true or sticky is true, then
1. Perform ? Set(R, "lastIndex", 𝔽(e), true).
Let n be the number of elements in r.[[Captures]].
Assert: n = R.[[RegExpRecord]].[[CapturingGroupsCount]].
Assert: n < 2³² - 1.
Let A be ! ArrayCreate(n + 1).
Assert: The mathematical value of A's "length" property is n + 1.
Perform ! CreateDataPropertyOrThrow(A, "index", 𝔽(lastIndex)).
Perform ! CreateDataPropertyOrThrow(A, "input", S).
Let match be the Match Record { [[StartIndex]]: lastIndex, [[EndIndex]]: e }.
Let indices be a new empty List.
Let groupNames be a new empty List.
Append match to indices.
Let matchedSubstr be GetMatchString(S, match).
Perform ! CreateDataPropertyOrThrow(A, "0", matchedSubstr).
If R contains any GroupName, then
1. Let groups be OrdinaryObjectCreate(null).
2. Let hasGroups be true.
Else,
1. Let groups be undefined.
2. Let hasGroups be false.
Perform ! CreateDataPropertyOrThrow(A, "groups", groups).
Let matchedGroupNames be a new empty List.
For each integer i such that 1 ≤ i ≤ n, in ascending order, do
1. Let captureI be the ith element of r.[[Captures]].
2. If captureI is undefined, then
  1. Let capturedValue be undefined.
  2. Append undefined to indices.
3. Else,
  1. Let captureStart be captureI.[[StartIndex]].
  2. Let captureEnd be captureI.[[EndIndex]].
  3. If fullUnicode is true, then
    1. Set captureStart to GetStringIndex(S, captureStart).
    2. Set captureEnd to GetStringIndex(S, captureEnd).
  4. Let capture be the Match Record { [[StartIndex]]: captureStart, [[EndIndex]]: captureEnd }.
  5. Let capturedValue be GetMatchString(S, capture).
  6. Append capture to indices.
4. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(i)), capturedValue).
5. If the ith capture of R was defined with a GroupName, then
  1. Let s be the CapturingGroupName of that GroupName.
  2. If matchedGroupNames contains s, then
    1. Assert: capturedValue is undefined.
    2. Append undefined to groupNames.
  3. Else,
    1. If capturedValue is not undefined, append s to matchedGroupNames.
    2. Perform ! CreateDataPropertyOrThrow(groups, s, capturedValue).
    3. Append s to groupNames.
6. Else,
  1. Append undefined to groupNames.
If hasIndices is true, then
1. Let indicesArray be MakeMatchIndicesIndexPairArray(S, indices, groupNames, hasGroups).
2. Perform ! CreateDataPropertyOrThrow(A, "indices", indicesArray).
Return A.
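
The interaction between "lastIndex" and the "g" and "y" flags is the part of this algorithm most often relied on in practice. The following sketch traces it through RegExp.prototype.exec, assuming a conforming host; the variable names are local to the example.

```javascript
// Global: each success stores the end index; a failure resets lastIndex to 0.
const g = /o/g;
const first = g.exec("foo");
const afterFirst = g.lastIndex;
const second = g.exec("foo");
const third = g.exec("foo");
const afterFailure = g.lastIndex;

// Sticky: the match must begin exactly at lastIndex.
const sticky = /o/y;
const stickyMiss = sticky.exec("foo");
sticky.lastIndex = 1;
const stickyHit = sticky.exec("foo");

// Neither flag: lastIndex is treated as 0 and never written.
const plain = /o/;
plain.lastIndex = 2;
const plainHit = plain.exec("foo");

// Named groups populate a null-prototype "groups" object.
const named = /(?<head>\w)(?<tail>\w*)/.exec("hey");

console.log(first.index, afterFirst, second.index, third, afterFailure); // 1 2 2 null 0
console.log(stickyMiss, stickyHit.index);                                // null 1
console.log(plainHit.index, plain.lastIndex);                            // 1 2
console.log(named.groups.head, named.groups.tail);                       // "h" "ey"
```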

22.2.7.3 AdvanceStringIndex ( `S`, `index`, `unicode` )

The abstract operation AdvanceStringIndex takes arguments S (a String), index (a non-negative integer), and unicode (a Boolean) and returns an integer. It performs the following steps when called:

Assert: index ≤ 2⁵³ - 1.
If unicode is false, return index + 1.
Let length be the length of S.
If index + 1 ≥ length, return index + 1.
Let cp be CodePointAt(S, index).
Return index + cp.[[CodeUnitCount]].

22.2.7.4 GetStringIndex ( `S`, `codePointIndex` )

The abstract operation GetStringIndex takes arguments S (a String) and codePointIndex (a non-negative integer) and returns a non-negative integer. It interprets S as a sequence of UTF-16 encoded code points, as described in 6.1.4, and returns the code unit index corresponding to code point index codePointIndex when such an index exists. Otherwise, it returns the length of S. It performs the following steps when called:

If S is the empty String, return 0.
Let len be the length of S.
Let codeUnitCount be 0.
Let codePointCount be 0.
Repeat, while codeUnitCount < len,
1. If codePointCount = codePointIndex, return codeUnitCount.
2. Let cp be CodePointAt(S, codeUnitCount).
3. Set codeUnitCount to codeUnitCount + cp.[[CodeUnitCount]].
4. Set codePointCount to codePointCount + 1.
Return len.
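
AdvanceStringIndex and GetStringIndex translate almost line for line into script. The sketch below uses String.prototype.codePointAt in place of the CodePointAt abstract operation; advanceStringIndex and getStringIndex are hypothetical names for this illustration, not built-ins.

```javascript
// Sketch of AdvanceStringIndex: step over one code unit, or one whole
// code point when unicode is true.
function advanceStringIndex(s, index, unicode) {
  if (!unicode) return index + 1;
  if (index + 1 >= s.length) return index + 1;
  const cp = s.codePointAt(index);
  return index + (cp > 0xffff ? 2 : 1); // a surrogate pair occupies two code units
}

// Sketch of GetStringIndex: map a code point index to a code unit index.
function getStringIndex(s, codePointIndex) {
  let codeUnitCount = 0;
  let codePointCount = 0;
  while (codeUnitCount < s.length) {
    if (codePointCount === codePointIndex) return codeUnitCount;
    const cp = s.codePointAt(codeUnitCount);
    codeUnitCount += cp > 0xffff ? 2 : 1;
    codePointCount += 1;
  }
  return s.length;
}

// "a", U+1F600 (two code units), "b": four code units, three code points.
const sample = "a\u{1F600}b";
console.log(advanceStringIndex(sample, 1, false)); // 2
console.log(advanceStringIndex(sample, 1, true));  // 3
console.log(getStringIndex(sample, 2));            // 3
console.log(getStringIndex(sample, 5));            // 4
```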

22.2.7.5 Match Records

A Match Record is a Record value used to encapsulate the start and end indices of a regular expression match or capture.

Match Records have the fields listed in Table 70.

Table 70: Match Record Fields

Field Name	Value	Meaning
`[[StartIndex]]`	a non-negative integer	The number of code units from the start of a string at which the match begins (inclusive).
`[[EndIndex]]`	an integer ≥ `[[StartIndex]]`	The number of code units from the start of a string at which the match ends (exclusive).

22.2.7.6 GetMatchString ( `S`, `match` )

The abstract operation GetMatchString takes arguments S (a String) and match (a Match Record) and returns a String. It performs the following steps when called:

Assert: match.[[StartIndex]] ≤ match.[[EndIndex]] ≤ the length of S.
Return the substring of S from match.[[StartIndex]] to match.[[EndIndex]].

22.2.7.7 GetMatchIndexPair ( `S`, `match` )

The abstract operation GetMatchIndexPair takes arguments S (a String) and match (a Match Record) and returns an Array. It performs the following steps when called:

Assert: match.[[StartIndex]] ≤ match.[[EndIndex]] ≤ the length of S.
Return CreateArrayFromList(« 𝔽(match.[[StartIndex]]), 𝔽(match.[[EndIndex]]) »).

22.2.7.8 MakeMatchIndicesIndexPairArray ( `S`, `indices`, `groupNames`, `hasGroups` )

The abstract operation MakeMatchIndicesIndexPairArray takes arguments S (a String), indices (a List of either Match Records or undefined), groupNames (a List of either Strings or undefined), and hasGroups (a Boolean) and returns an Array. It performs the following steps when called:

Let n be the number of elements in indices.
Assert: n < 2³² - 1.
Assert: groupNames has n - 1 elements.
NOTE: The groupNames List contains elements aligned with the indices List starting at indices[1].
Let A be ! ArrayCreate(n).
If hasGroups is true, let groups be OrdinaryObjectCreate(null); else let groups be undefined.
Perform ! CreateDataPropertyOrThrow(A, "groups", groups).
For each integer i such that 0 ≤ i < n, in ascending order, do
1. Let matchIndices be indices[i].
2. If matchIndices is not undefined, let matchIndexPair be GetMatchIndexPair(S, matchIndices); else let matchIndexPair be undefined.
3. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(i)), matchIndexPair).
4. If i > 0, then
  1. Let s be groupNames[i - 1].
  2. If s is not undefined, then
    1. Assert: groups is not undefined.
    2. Perform ! CreateDataPropertyOrThrow(groups, s, matchIndexPair).
Return A.
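
The array built by this operation is what scripts see as the "indices" property of a match result when the "d" flag is set. The sketch below assumes a host that supports the "d" flag; the RegExp is created with the constructor so that hosts without the flag fail at run time rather than at parse time.

```javascript
const re = new RegExp("a(?<mid>b)?(c)", "d");
const match = re.exec("xac");

// Index pairs are [start, end) in code units; index 0 is the whole match.
const whole = match.indices[0];

// A capture that did not participate is undefined, both by number and by name.
const midByNumber = match.indices[1];
const midByName = match.indices.groups.mid;
const midPresent = "mid" in match.indices.groups;

// The second capture matched "c".
const tail = match.indices[2];

console.log(whole, midByNumber, midByName, midPresent, tail);
```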

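22.2.8 Properties of RegExp Instances

RegExp instances are ordinary objects that inherit properties from the RegExp prototype object. RegExp instances have internal slots [[OriginalSource]], [[OriginalFlags]], [[RegExpRecord]], and [[RegExpMatcher]]. The value of the [[RegExpMatcher]] internal slot is an Abstract Closure representation of the Pattern of the RegExp object.

Note

Prior to ECMAScript 2015, RegExp instances were specified as having the own data properties "source", "global", "ignoreCase", and "multiline". Those properties are now specified as accessor properties of RegExp.prototype.

RegExp instances also have the following property:

22.2.8.1 lastIndex

The value of the "lastIndex" property specifies the String index at which to start the next match. It is coerced to an integral Number when used (see 22.2.7.2). This property shall have the attributes { [[Writable]]: true, [[Enumerable]]: false, [[Configurable]]: false }.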

22.2.9 RegExp String Iterator Objects

A RegExp String Iterator is an object that represents a specific iteration over some specific String instance object, matching against some specific RegExp instance object. There is no named constructor for RegExp String Iterator objects. Instead, RegExp String Iterator objects are created by calling certain methods of RegExp instance objects.

22.2.9.1 CreateRegExpStringIterator ( `R`, `S`, `global`, `fullUnicode` )

The abstract operation CreateRegExpStringIterator takes arguments R (an Object), S (a String), global (a Boolean), and fullUnicode (a Boolean) and returns an Object. It performs the following steps when called:

Let iterator be OrdinaryObjectCreate(%RegExpStringIteratorPrototype%, « [[IteratingRegExp]], [[IteratedString]], [[Global]], [[Unicode]], [[Done]] »).
Set iterator.[[IteratingRegExp]] to R.
Set iterator.[[IteratedString]] to S.
Set iterator.[[Global]] to global.
Set iterator.[[Unicode]] to fullUnicode.
Set iterator.[[Done]] to false.
Return iterator.

22.2.9.2 The %RegExpStringIteratorPrototype% Object

The %RegExpStringIteratorPrototype% object:

has properties that are inherited by all RegExp String Iterator objects.
is an ordinary object.
has a [[Prototype]] internal slot whose value is %Iterator.prototype%.
has the following properties:

22.2.9.2.1 %RegExpStringIteratorPrototype%.next ( )

This method performs the following steps when called:

Let O be the this value.
If O is not an Object, throw a TypeError exception.
If O does not have all of the internal slots of a RegExp String Iterator Object Instance (see 22.2.9.3), throw a TypeError exception.
If O.[[Done]] is true, then
1. Return CreateIteratorResultObject(undefined, true).
Let R be O.[[IteratingRegExp]].
Let S be O.[[IteratedString]].
Let global be O.[[Global]].
Let fullUnicode be O.[[Unicode]].
Let match be ? RegExpExec(R, S).
If match is null, then
1. Set O.[[Done]] to true.
2. Return CreateIteratorResultObject(undefined, true).
If global is false, then
1. Set O.[[Done]] to true.
2. Return CreateIteratorResultObject(match, false).
Let matchStr be ? ToString(? Get(match, "0")).
If matchStr is the empty String, then
1. Let thisIndex be ℝ(? ToLength(? Get(R, "lastIndex"))).
2. Let nextIndex be AdvanceStringIndex(S, thisIndex, fullUnicode).
3. Perform ? Set(R, "lastIndex", 𝔽(nextIndex), true).
Return CreateIteratorResultObject(match, false).

22.2.9.2.2 %RegExpStringIteratorPrototype% [ %Symbol.toStringTag% ]

The initial value of the %Symbol.toStringTag% property is the String value "RegExp String Iterator".

This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: true }.
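
A RegExp String Iterator is normally obtained through String.prototype.matchAll. The sketch below steps one manually to show the [[Done]] latch, the empty-match advancement, and the toStringTag value, assuming a conforming host; the variable names are local to the example.

```javascript
const iterator = "a1b2".matchAll(/\d/g);

const r1 = iterator.next(); // first match
const r2 = iterator.next(); // second match
const r3 = iterator.next(); // no further match: [[Done]] becomes true
const r4 = iterator.next(); // stays done without running the RegExp again

// Empty matches advance lastIndex, so every position is visited exactly once.
const positions = [..."ab".matchAll(/(?:)/g)].map((m) => m.index);

const tag = Object.prototype.toString.call(iterator);

console.log(r1.value[0], r2.value[0], r3.done, r4.done); // "1" "2" true true
console.log(positions);                                  // [ 0, 1, 2 ]
console.log(tag);                                        // "[object RegExp String Iterator]"
```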

22.2.9.3 Properties of RegExp String Iterator Instances

RegExp String Iterator instances are ordinary objects that inherit properties from the %RegExpStringIteratorPrototype% intrinsic object. RegExp String Iterator instances are initially created with the internal slots listed in Table 71.

Table 71: Internal Slots of RegExp String Iterator Instances

Internal Slot	Type	Description
`[[IteratingRegExp]]`	an Object	The regular expression used for iteration. IsRegExp(`[[IteratingRegExp]]`) is initially true.
`[[IteratedString]]`	a String	The String value being iterated upon.
`[[Global]]`	a Boolean	Indicates whether the `[[IteratingRegExp]]` is global or not.
`[[Unicode]]`	a Boolean	Indicates whether the `[[IteratingRegExp]]` is in Unicode mode or not.
`[[Done]]`	a Boolean	Indicates whether the iteration is complete or not.

22 텍스트 처리

22.1 String 객체

22.1.1 String 생성자

22.1.1.1 String ( value )

22.1.2 String 생성자의 프로퍼티

22.1.2.1 String.fromCharCode ( ...codeUnits )

22.1.2.2 String.fromCodePoint ( ...codePoints )

22.1.2.3 String.prototype

22.1.2.4 String.raw ( template, ...substitutions )

22.1.3 String 프로토타입 객체의 프로퍼티

22.1.3.1 String.prototype.at ( index )

22.1.3.2 String.prototype.charAt ( pos )

22.1.3.3 String.prototype.charCodeAt ( pos )

22.1.3.4 String.prototype.codePointAt ( pos )

22.1.3.5 String.prototype.concat ( ...args )

22.1.3.6 String.prototype.constructor

22.1.3.7 String.prototype.endsWith ( searchString [ , endPosition ] )

22.1.3.8 String.prototype.includes ( searchString [ , position ] )

22.1.3.9 String.prototype.indexOf ( searchString [ , position ] )

22.1.3.10 String.prototype.isWellFormed ( )

22.1.3.11 String.prototype.lastIndexOf ( searchString [ , position ] )

22.1.3.12 String.prototype.localeCompare ( that [ , reserved1 [ , reserved2 ] ] )

22.1.3.13 String.prototype.match ( regexp )

22.1.3.14 String.prototype.matchAll ( regexp )

22.1.3.15 String.prototype.normalize ( [ form ] )

22.1.3.16 String.prototype.padEnd ( maxLength [ , fillString ] )

22.1.3.17 String.prototype.padStart ( maxLength [ , fillString ] )

22.1.3.17.1 StringPaddingBuiltinsImpl ( O, maxLength, fillString, placement )

22.1.3.17.2 StringPad ( S, maxLength, fillString, placement )

22.1.3.17.3 ToZeroPaddedDecimalString ( n, minLength )

22.1.3.18 String.prototype.repeat ( count )

22.1.3.19 String.prototype.replace ( searchValue, replaceValue )

22.1.3.19.1 GetSubstitution ( matched, str, position, captures, namedCaptures, replacementTemplate )

22.1.3.20 String.prototype.replaceAll ( searchValue, replaceValue )

22.1.3.21 String.prototype.search ( regexp )

22.1.3.22 String.prototype.slice ( start, end )

22.1.3.23 String.prototype.split ( separator, limit )

22.1.3.24 String.prototype.startsWith ( searchString [ , position ] )

22.1.3.25 String.prototype.substring ( start, end )

22.1.3.26 String.prototype.toLocaleLowerCase ( [ reserved1 [ , reserved2 ] ] )

22.1.3.27 String.prototype.toLocaleUpperCase ( [ reserved1 [ , reserved2 ] ] )

22.1.3.28 String.prototype.toLowerCase ( )

22.1.3.29 String.prototype.toString ( )

22.1.3.30 String.prototype.toUpperCase ( )

22.1.3.31 String.prototype.toWellFormed ( )

22.1.3.32 String.prototype.trim ( )

22.1.3.32.1 TrimString ( string, where )

22.1.3.33 String.prototype.trimEnd ( )

22.1.3.34 String.prototype.trimStart ( )

22.1.3.35 String.prototype.valueOf ( )

22.1.3.35.1 ThisStringValue ( value )

22.1.3.36 String.prototype [ %Symbol.iterator% ] ( )

22.1.4 String 인스턴스의 프로퍼티

22.1.4.1 length

22.1.5 String 이터레이터 객체

22.1.5.1 %StringIteratorPrototype% 객체

22.1.5.1.1 %StringIteratorPrototype%.next ( )

22.1.5.1.2 %StringIteratorPrototype% [ %Symbol.toStringTag% ]

22.2 RegExp (정규 표현식) 객체

22.2.1 패턴 (Patterns)

구문 (Syntax)

22.2.1.1 정적 의미론: 조기 오류 (Early Errors)

22.2.1.2 정적 의미론: CountLeftCapturingParensWithin ( node: 구문 노드, ): 음이 아닌 정수

22.2.1.3 정적 의미론: CountLeftCapturingParensBefore ( node: 구문 노드, ): 음이 아닌 정수

22.2.1.4 정적 의미론: MightBothParticipate ( x: 구문 노드, y: 구문 노드, ): 불리언

22.2.1.5 정적 의미론: CapturingGroupNumber : 양의 정수

22.2.1.6 정적 의미론: IsCharacterClass : 불리언

22.2.1.7 정적 의미론: CharacterValue : 음이 아닌 정수

22.2.1.8 정적 의미론: MayContainStrings : 불리언

22.2.1.9 정적 의미론: GroupSpecifiersThatMatch ( thisGroupName: GroupName 구문 노드, ): GroupSpecifier 구문 노드들의 리스트

22.2.1.10 정적 의미론: CapturingGroupName : 문자열

22.2.1.11 정적 의미론: RegExpIdentifierCodePoints : 코드 포인트들의 리스트

22.2.1.12 정적 의미론: RegExpIdentifierCodePoint : 코드 포인트

22.2.2 패턴 의미론 (Pattern Semantics)

22.2.2.1 표기법 (Notation)

22.2.2.1.1 RegExp 레코드 (RegExp Records)

22.2.2.2 런타임 의미론: CompilePattern : 문자 리스트와 음이 아닌 정수를 받아 MatchState 또는 failure 를 반환하는 추상 클로저

22.2.2.3 런타임 의미론: CompileSubpattern : Matcher

22.2.2.3.1 RepeatMatcher ( m, min, max, greedy, x, c, parenIndex, parenCount )

22.2.2.3.2 EmptyMatcher ( )

22.1.1.1 String ( `value` )

22.1.2.1 String.fromCharCode ( ...`codeUnits` )

22.1.2.2 String.fromCodePoint ( ...`codePoints` )

22.1.2.4 String.raw ( `template`, ...`substitutions` )

22.1.3.1 String.prototype.at ( `index` )

22.1.3.2 String.prototype.charAt ( `pos` )

22.1.3.3 String.prototype.charCodeAt ( `pos` )

22.1.3.4 String.prototype.codePointAt ( `pos` )

22.1.3.5 String.prototype.concat ( ...`args` )

22.1.3.7 String.prototype.endsWith ( `searchString` [ , `endPosition` ] )

22.1.3.8 String.prototype.includes ( `searchString` [ , `position` ] )

22.1.3.9 String.prototype.indexOf ( `searchString` [ , `position` ] )

22.1.3.11 String.prototype.lastIndexOf ( `searchString` [ , `position` ] )

22.1.3.12 String.prototype.localeCompare ( `that` [ , `reserved1` [ , `reserved2` ] ] )

22.1.3.13 String.prototype.match ( `regexp` )

22.1.3.14 String.prototype.matchAll ( `regexp` )

22.1.3.15 String.prototype.normalize ( [ `form` ] )

22.1.3.16 String.prototype.padEnd ( `maxLength` [ , `fillString` ] )

22.1.3.17 String.prototype.padStart ( `maxLength` [ , `fillString` ] )

22.1.3.17.1 StringPaddingBuiltinsImpl ( `O`, `maxLength`, `fillString`, `placement` )

22.1.3.17.2 StringPad ( `S`, `maxLength`, `fillString`, `placement` )

22.1.3.17.3 ToZeroPaddedDecimalString ( `n`, `minLength` )

22.1.3.18 String.prototype.repeat ( `count` )

22.1.3.19 String.prototype.replace ( `searchValue`, `replaceValue` )

22.1.3.19.1 GetSubstitution ( `matched`, `str`, `position`, `captures`, `namedCaptures`, `replacementTemplate` )

22.1.3.20 String.prototype.replaceAll ( `searchValue`, `replaceValue` )

22.1.3.21 String.prototype.search ( `regexp` )

22.1.3.22 String.prototype.slice ( `start`, `end` )

22.1.3.23 String.prototype.split ( `separator`, `limit` )

22.1.3.24 String.prototype.startsWith ( `searchString` [ , `position` ] )

22.1.3.25 String.prototype.substring ( `start`, `end` )

22.1.3.26 String.prototype.toLocaleLowerCase ( [ `reserved1` [ , `reserved2` ] ] )

22.1.3.27 String.prototype.toLocaleUpperCase ( [ `reserved1` [ , `reserved2` ] ] )

22.1.3.32.1 TrimString ( `string`, `where` )

22.1.3.35.1 ThisStringValue ( `value` )

22.2.1.2 Static Semantics: CountLeftCapturingParensWithin ( `node`: a Parse Node ): a non-negative integer

22.2.1.3 Static Semantics: CountLeftCapturingParensBefore ( `node`: a Parse Node ): a non-negative integer

22.2.1.4 Static Semantics: MightBothParticipate ( `x`: a Parse Node, `y`: a Parse Node ): a Boolean

22.2.1.9 Static Semantics: GroupSpecifiersThatMatch ( `thisGroupName`: a GroupName Parse Node ): a List of GroupSpecifier Parse Nodes

22.2.2.3.1 RepeatMatcher ( `m`, `min`, `max`, `greedy`, `x`, `c`, `parenIndex`, `parenCount` )

22.2.2.3.3 MatchTwoAlternatives ( `m1`, `m2` )

22.2.2.3.4 MatchSequence ( `m1`, `m2`, `direction` )

22.2.2.4.1 IsWordChar ( `rer`, `Input`, `e` )

22.2.2.7.1 CharacterSetMatcher ( `rer`, `A`, `invert`, `direction` )

22.2.2.7.2 BackreferenceMatcher ( `rer`, `ns`, `direction` )

22.2.2.7.3 Canonicalize ( `rer`, `ch` )

22.2.2.7.4 UpdateModifiers ( `rer`, `add`, `remove` )

22.2.2.9.1 CharacterRange ( `A`, `B` )

22.2.2.9.2 HasEitherUnicodeFlag ( `rer` )

22.2.2.9.3 WordCharacters ( `rer` )

22.2.2.9.4 AllCharacters ( `rer` )

22.2.2.9.5 MaybeSimpleCaseFolding ( `rer`, `A` )

22.2.2.9.6 CharacterComplement ( `rer`, `S` )

22.2.2.9.7 UnicodeMatchProperty ( `rer`, `p` )

22.2.2.9.8 UnicodeMatchPropertyValue ( `p`, `v` )

22.2.3.1 RegExpCreate ( `P`, `F` )

22.2.3.2 RegExpAlloc ( `newTarget` )

22.2.3.3 RegExpInitialize ( `obj`, `pattern`, `flags` )

22.2.3.4 Static Semantics: ParsePattern ( `patternText`: a sequence of Unicode code points, `u`: a Boolean, `v`: a Boolean ): a Parse Node or a non-empty List of SyntaxError objects

22.2.4.1 RegExp ( `pattern`, `flags` )

22.2.5.1 RegExp.escape ( `S` )

22.2.5.1.1 EncodeForRegExpEscape ( `cp` )

22.2.6.2 RegExp.prototype.exec ( `string` )

22.2.6.4.1 RegExpHasFlag ( `R`, `codeUnit` )

22.2.6.8 RegExp.prototype [ %Symbol.match% ] ( `string` )

22.2.6.9 RegExp.prototype [ %Symbol.matchAll% ] ( `string` )

22.2.6.11 RegExp.prototype [ %Symbol.replace% ] ( `string`, `replaceValue` )

22.2.6.12 RegExp.prototype [ %Symbol.search% ] ( `string` )

22.2.6.13.1 EscapeRegExpPattern ( `P`, `F` )

22.2.6.14 RegExp.prototype [ %Symbol.split% ] ( `string`, `limit` )

22.2.6.16 RegExp.prototype.test ( `S` )

22.2.7.1 RegExpExec ( `R`, `S` )

22.2.7.2 RegExpBuiltinExec ( `R`, `S` )

22.2.7.3 AdvanceStringIndex ( `S`, `index`, `unicode` )

22.2.7.4 GetStringIndex ( `S`, `codePointIndex` )

22.2.7.6 GetMatchString ( `S`, `match` )

22.2.7.7 GetMatchIndexPair ( `S`, `match` )

22.2.7.8 MakeMatchIndicesIndexPairArray ( `S`, `indices`, `groupNames`, `hasGroups` )

22.2.9.1 CreateRegExpStringIterator ( `R`, `S`, `global`, `fullUnicode` )