12 ECMAScript Language: Lexical Grammar

Lexical goal symbols and related productions referenced in this clause: InputElementRegExpOrTemplateTail, InputElementHashbangOrRegExp, InputElementTemplateTail, TemplateSubstitutionTail.

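12.1 Unicode Format-Control Characters

The Unicode format-control characters (i.e., the characters in category “Cf” in the Unicode Character Database, such as LEFT-TO-RIGHT MARK or RIGHT-TO-LEFT MARK) are control codes used to control the formatting of a range of text in the absence of higher-level protocols for this (such as mark-up languages).

It is useful to allow format-control characters in source text to facilitate editing and display. All format-control characters may be used within comments, and within string literals, template literals, and regular expression literals.

U+FEFF (ZERO WIDTH NO-BREAK SPACE) is a format-control character used primarily at the start of a text to mark it as Unicode and to allow detection of the text's encoding and byte order. <ZWNBSP> characters intended for this purpose can sometimes also appear after the start of a text, for example as a result of concatenating files. In ECMAScript source text, <ZWNBSP> code points are treated as white space characters (see 12.2) outside of comments, string literals, template literals, and regular expression literals.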

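12.2 White Space

White space code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other, but are otherwise insignificant. White space code points may occur between any two tokens and at the start or end of input. White space code points may occur within a StringLiteral, a RegularExpressionLiteral, a Template, or a TemplateSubstitutionTail, where they are considered significant code points forming part of a literal value. They may also occur within a Comment, but cannot appear within any other kind of token.

The ECMAScript white space code points are listed in Table 33.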

Table 33: White Space Code Points

Code Points	Name	Abbreviation
`U+0009`	CHARACTER TABULATION	<TAB>
`U+000B`	LINE TABULATION	<VT>
`U+000C`	FORM FEED (FF)	<FF>
`U+FEFF`	ZERO WIDTH NO-BREAK SPACE	<ZWNBSP>
any code point in general category “Space_Separator”		<USP>

Note 1

U+0020 (SPACE) and U+00A0 (NO-BREAK SPACE) code points are part of <USP>.

Note 2

Other than for the code points listed in Table 33, ECMAScript WhiteSpace intentionally excludes all code points that have the Unicode “White_Space” property but which are not classified in general category “Space_Separator” (“Zs”).
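The white space rules above can be sketched as a predicate. This is a minimal illustration, assuming an engine that supports Unicode property escapes (`\p{Zs}`) in regular expressions; the function name `isWhiteSpaceCodePoint` is hypothetical and not part of the specification.

```javascript
// Hypothetical helper: true when the code point `cp` (a number) is an
// ECMAScript white space code point as listed in Table 33.
function isWhiteSpaceCodePoint(cp) {
  // <TAB>, <VT>, <FF>, and <ZWNBSP> are listed explicitly in Table 33.
  if (cp === 0x0009 || cp === 0x000b || cp === 0x000c || cp === 0xfeff) {
    return true;
  }
  // <USP>: any code point in general category Space_Separator (Zs).
  return /^\p{Zs}$/u.test(String.fromCodePoint(cp));
}

console.log(isWhiteSpaceCodePoint(0x0020)); // SPACE, part of <USP>
console.log(isWhiteSpaceCodePoint(0x00a0)); // NO-BREAK SPACE, part of <USP>
console.log(isWhiteSpaceCodePoint(0x0085)); // NEXT LINE: has White_Space but is not in Zs
console.log(isWhiteSpaceCodePoint(0x000a)); // LINE FEED is a line terminator, not white space
```

The NEXT LINE case illustrates Note 2: the code point carries the Unicode “White_Space” property, yet it is excluded because its general category is not “Zs”.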

Syntax

WhiteSpace ::
  <TAB>
  <VT>
  <FF>
  <ZWNBSP>
  <USP>

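12.3 Line Terminators

Like white space code points, line terminator code points are used to improve source text readability and to separate tokens from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (12.10). A line terminator cannot occur within any token except a StringLiteral, Template, or TemplateSubstitutionTail. <LF> and <CR> line terminators cannot occur within a StringLiteral token except as part of a LineContinuation.

A line terminator can occur within a MultiLineComment but cannot occur within a SingleLineComment.

Line terminators are included in the set of white space code points that are matched by the \s class in regular expressions.

The ECMAScript line terminator code points are listed in Table 34.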

Table 34: Line Terminator Code Points

Code Point	Unicode Name	Abbreviation
`U+000A`	LINE FEED (LF)	<LF>
`U+000D`	CARRIAGE RETURN (CR)	<CR>
`U+2028`	LINE SEPARATOR	<LS>
`U+2029`	PARAGRAPH SEPARATOR	<PS>

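Only the Unicode code points in Table 34 are treated as line terminators. Other new line or line breaking Unicode code points are not treated as line terminators but are treated as white space if they meet the requirements listed in Table 33. The sequence <CR><LF> is commonly used as a line terminator. It should be considered a single SourceCharacter for the purpose of reporting line numbers.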
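For line-number reporting, the treatment of <CR><LF> as a single terminator can be sketched with a regular expression that tries the two-code-point sequence first. The helper name `splitSourceLines` is hypothetical and only illustrates the rule.

```javascript
// Hypothetical helper: split source text at each line terminator sequence.
// The alternation tries <CR><LF> first so that the pair counts only once.
function splitSourceLines(sourceText) {
  return sourceText.split(/\r\n|[\n\r\u2028\u2029]/);
}

const lines = splitSourceLines("a\r\nb\rc\u2028d\u2029e\nf");
console.log(lines.length); // six lines, one per terminator plus the last
console.log(/\s/.test("\u2028")); // line terminators are matched by \s
console.log(splitSourceLines("a\u0085b").length); // U+0085 is not in Table 34
```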

Syntax

LineTerminator ::
  <LF>
  <CR>
  <LS>
  <PS>

LineTerminatorSequence ::
  <LF>
  <CR> [lookahead ≠ <LF>]
  <LS>
  <PS>
  <CR> <LF>

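12.4 Comments

Comments can be either single-line or multi-line. Multi-line comments cannot nest.

Because a single-line comment can contain any Unicode code point except a LineTerminator code point, and because of the general rule that a token is always as long as possible, a single-line comment always consists of all code points from the // marker to the end of the line. However, the LineTerminator at the end of the line is not considered to be part of the single-line comment; it is recognized separately by the lexical grammar and becomes part of the stream of input elements for the syntactic grammar. This point is very important, because it implies that the presence or absence of single-line comments does not affect the process of automatic semicolon insertion (see 12.10).

Comments behave like white space and are discarded except that, if a MultiLineComment contains a line terminator code point, then the entire comment is considered to be a LineTerminator for purposes of parsing by the syntactic grammar.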
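The last rule has an observable effect on automatic semicolon insertion. The following sketch uses two illustrative function names (not taken from the specification) to contrast a multi-line comment that contains a line terminator with one that does not.

```javascript
// The comment below contains a line terminator, so the whole comment counts
// as a LineTerminator and a semicolon is inserted right after `return`.
function multiLineCommentAfterReturn() {
  return /* this comment
            spans two lines */ 42;
}

// This comment contains no line terminator, so it is discarded like white space.
function sameLineCommentAfterReturn() {
  return /* one line */ 42;
}

console.log(multiLineCommentAfterReturn()); // undefined
console.log(sameLineCommentAfterReturn()); // 42
```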

Syntax

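Comment ::
  MultiLineComment
  SingleLineComment

MultiLineComment ::
  /* MultiLineCommentChars_opt */

MultiLineCommentChars ::
  MultiLineNotAsteriskChar MultiLineCommentChars_opt
  * PostAsteriskCommentChars_opt

PostAsteriskCommentChars ::
  MultiLineNotForwardSlashOrAsteriskChar MultiLineCommentChars_opt
  * PostAsteriskCommentChars_opt

MultiLineNotAsteriskChar ::
  SourceCharacter but not *

MultiLineNotForwardSlashOrAsteriskChar ::
  SourceCharacter but not one of / or *

SingleLineComment ::
  // SingleLineCommentChars_opt

SingleLineCommentChars ::
  SingleLineCommentChar SingleLineCommentChars_opt

SingleLineCommentChar ::
  SourceCharacter but not LineTerminator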

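A number of productions in this section are given alternative definitions in section B.1.1.

12.5 Hashbang Comments

Hashbang Comments are location-sensitive and, like other types of comments, are discarded from the stream of input elements for the syntactic grammar.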

Syntax

HashbangComment ::
  #! SingleLineCommentChars_opt

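12.6 Tokens

Syntax

CommonToken ::
  IdentifierName
  PrivateIdentifier
  Punctuator
  NumericLiteral
  StringLiteral
  Template

Note

The DivPunctuator, RegularExpressionLiteral, RightBracePunctuator, and TemplateSubstitutionTail productions derive additional tokens that are not included in the CommonToken production.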

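12.7 Names and Keywords

IdentifierName and ReservedWord are tokens that are interpreted according to the Default Identifier Syntax given in Unicode Standard Annex #31, Identifier and Pattern Syntax, with some small modifications. ReservedWord is an enumerated subset of IdentifierName. The syntactic grammar defines Identifier as an IdentifierName that is not a ReservedWord. The Unicode identifier grammar is based on character properties specified by the Unicode Standard. The Unicode code points in the specified categories in the latest version of the Unicode Standard must be treated as in those categories by all conforming ECMAScript implementations. ECMAScript implementations may recognize identifier code points defined in later editions of the Unicode Standard.

Note 1

This standard specifies specific code point additions: U+0024 (DOLLAR SIGN) and U+005F (LOW LINE) are permitted anywhere in an IdentifierName.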

Syntax

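PrivateIdentifier ::
  # IdentifierName

IdentifierName ::
  IdentifierStart
  IdentifierName IdentifierPart

IdentifierStart ::
  IdentifierStartChar
  \ UnicodeEscapeSequence

IdentifierPart ::
  IdentifierPartChar
  \ UnicodeEscapeSequence

IdentifierStartChar ::
  UnicodeIDStart
  $
  _

IdentifierPartChar ::
  UnicodeIDContinue
  $

AsciiLetter :: one of
  a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

UnicodeIDStart ::
  any Unicode code point with the Unicode property “ID_Start”

UnicodeIDContinue ::
  any Unicode code point with the Unicode property “ID_Continue”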

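The definition of the nonterminal UnicodeEscapeSequence is given in 12.9.4.

Note 2

The nonterminal IdentifierPart derives _ via UnicodeIDContinue.

Note 3

The sets of code points with Unicode properties “ID_Start” and “ID_Continue” include, respectively, the code points with Unicode properties “Other_ID_Start” and “Other_ID_Continue”.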
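As a rough illustration of the identifier rules in this section, the following sketch approximates IdentifierName with a regular expression. It assumes an engine with Unicode property escapes, deliberately ignores the \ UnicodeEscapeSequence forms, and uses the hypothetical names `identifierNamePattern` and `looksLikeIdentifierName`.

```javascript
// Approximation only: first code point is ID_Start, $ or _;
// every later code point is ID_Continue or $ (ID_Continue already covers _).
const identifierNamePattern = /^[\p{ID_Start}$_][\p{ID_Continue}$]*$/u;

// Hypothetical helper: does `text` look like an IdentifierName without escapes?
function looksLikeIdentifierName(text) {
  return identifierNamePattern.test(text);
}

console.log(looksLikeIdentifierName("$_a1")); // true
console.log(looksLikeIdentifierName("1a")); // false: a digit is not ID_Start
console.log(looksLikeIdentifierName("a-b")); // false: "-" is not ID_Continue
```

The check says nothing about reserved words; it only mirrors the lexical shape of a name.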

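12.7.1 Identifier Names

Unicode escape sequences are permitted in an IdentifierName, where they contribute a single Unicode code point equal to the IdentifierCodePoint of the UnicodeEscapeSequence. The \ preceding the UnicodeEscapeSequence does not contribute any code points. A UnicodeEscapeSequence cannot be used to contribute a code point to an IdentifierName that would otherwise be invalid. In other words, if a \ UnicodeEscapeSequence sequence were replaced by the SourceCharacter it contributes, the result must still be a valid IdentifierName that has the exact same sequence of SourceCharacter elements as the original IdentifierName. All interpretations of IdentifierName within this specification are based upon their actual code points regardless of whether or not an escape sequence was used to contribute any particular code point.

Two IdentifierNames that are canonically equivalent according to the Unicode Standard are not equal unless, after replacement of each UnicodeEscapeSequence, they are represented by the exact same sequence of code points.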

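12.7.1.1 Static Semantics: Early Errors

IdentifierStart :: \ UnicodeEscapeSequence

It is a Syntax Error if the IdentifierCodePoint of UnicodeEscapeSequence is not some Unicode code point matched by the IdentifierStartChar lexical grammar production.

IdentifierPart :: \ UnicodeEscapeSequence

It is a Syntax Error if the IdentifierCodePoint of UnicodeEscapeSequence is not some Unicode code point matched by the IdentifierPartChar lexical grammar production.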
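These two early errors can be observed by asking the engine to parse small snippets. The sketch below assumes an environment where the Function constructor is available; the helper name `parsesAsFunctionBody` is hypothetical.

```javascript
// Hypothetical helper: report whether `source` parses as a function body.
// new Function only parses the text here; the resulting function is never called.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

console.log(parsesAsFunctionBody("var \\u0061bc = 1;")); // true: \u0061 is "a"
console.log(parsesAsFunctionBody("var \\u0030bc = 1;")); // false: "0" cannot start a name
console.log(parsesAsFunctionBody("var abc\\u0030 = 1;")); // true: "0" may continue a name
console.log(parsesAsFunctionBody("var abc\\u002d = 1;")); // false: "-" cannot continue a name
```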

12.7.1.2 Static Semantics: IdentifierCodePoints : a List of code points

The syntax-directed operation IdentifierCodePoints takes no arguments and returns a List of code points. It is defined piecewise over the following productions:

IdentifierName :: IdentifierStart

Let cp be the IdentifierCodePoint of IdentifierStart.
Return « cp ».

IdentifierName :: IdentifierName IdentifierPart

Let cps be the IdentifierCodePoints of the derived IdentifierName.
Let cp be the IdentifierCodePoint of IdentifierPart.
Return the list-concatenation of cps and « cp ».

12.7.1.3 Static Semantics: IdentifierCodePoint : a code point

The syntax-directed operation IdentifierCodePoint takes no arguments and returns a code point. It is defined piecewise over the following productions:

IdentifierStart :: IdentifierStartChar

Return the code point matched by IdentifierStartChar.

IdentifierPart :: IdentifierPartChar

Return the code point matched by IdentifierPartChar.

UnicodeEscapeSequence :: u Hex4Digits

Return the code point whose numeric value is the MV of Hex4Digits.

UnicodeEscapeSequence :: u{ CodePoint }

Return the code point whose numeric value is the MV of CodePoint.

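12.7.2 Keywords and Reserved Words

A keyword is a token that matches IdentifierName, but also has a syntactic use; that is, it appears literally, in a fixed width font, in some syntactic production. The keywords of ECMAScript include if, while, async, await, and many others.

A reserved word is an IdentifierName that cannot be used as an identifier. Many keywords are reserved words, but some are not, and some are reserved only in certain contexts. if and while are always reserved words. await is reserved only inside async functions and modules. async is not reserved; it can be used as a variable name or statement label without restriction.

This specification uses a combination of grammatical productions and early error rules to specify which names are valid identifiers and which are reserved words. All tokens in the ReservedWord list below, except for await and yield, are unconditionally reserved. Exceptions for await and yield are specified in 13.1, using parameterized syntactic productions. Lastly, several early error rules restrict the set of valid identifiers. See 13.1.1, 14.3.1.1, 14.7.5.1, and 15.7.1. In summary, there are five categories of identifier names:

- Those that are always allowed as identifiers, and are not keywords, such as Math, window, toString, and _;
- Those that are never allowed as identifiers, namely the ReservedWords listed below except await and yield;
- Those that are contextually allowed as identifiers, namely await and yield;
- Those that are contextually disallowed as identifiers, in strict mode code: let, static, implements, interface, package, private, protected, and public;
- Those that are always allowed as identifiers, but also appear as keywords within certain syntactic productions, at places where Identifier is not allowed: as, async, from, get, meta, of, set, and target.

The term conditional keyword, or contextual keyword, is sometimes used to refer to the keywords that fall in the last three categories, and thus can be used as identifiers in some contexts and as keywords in others.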

Syntax

ReservedWord :: one of
  await break case catch class const continue debugger default delete do else enum export extends false finally for function if import in instanceof new null return super switch this throw true try typeof var void while with yield

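Note 1

Per 5.1.5, keywords in the grammar match literal sequences of specific SourceCharacter elements. A code point in a keyword cannot be expressed by a \ UnicodeEscapeSequence.

An IdentifierName can contain \ UnicodeEscapeSequences, but it is not possible to declare a variable named "else" by spelling it els\u{65}. The early error rules in 13.1.1 rule out identifiers with the same StringValue as a reserved word.

Note 2

enum is not currently used as a keyword in this specification. It is a future reserved word, set aside for use as a keyword in future language extensions.

Similarly, implements, interface, package, private, protected, and public are future reserved words in strict mode code.

Note 3

The names arguments and eval are not keywords, but they are subject to some restrictions in strict mode code. See 13.1.1, 8.6.4, 15.2.1, 15.5.1, 15.6.1, and 15.8.1.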
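The categories of identifier names can be probed by parsing small declarations. This is an illustrative sketch, assuming the Function constructor is available and that bodies it parses are non-strict unless they start with a Use Strict Directive; `parsesAsFunctionBody` is a hypothetical helper name.

```javascript
// Hypothetical helper: report whether `source` parses as a function body.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// Always allowed as identifiers, although they also appear as keywords.
console.log(parsesAsFunctionBody("var async = 1, of = 2, get = 3;")); // true
// Never allowed as identifiers.
console.log(parsesAsFunctionBody("var enum = 1;")); // false
// Contextually allowed: await is reserved only in async functions and modules.
console.log(parsesAsFunctionBody("var await = 1;")); // true
console.log(parsesAsFunctionBody("async function f() { var await = 1; }")); // false
// Contextually disallowed in strict mode code.
console.log(parsesAsFunctionBody("var let = 1;")); // true
console.log(parsesAsFunctionBody('"use strict"; var let = 1;')); // false
// A reserved word cannot be smuggled in through an escape sequence.
console.log(parsesAsFunctionBody("var els\\u{65} = 1;")); // false
```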

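12.8 Punctuators

Syntax

Punctuator ::
  OptionalChainingPunctuator
  OtherPunctuator

OptionalChainingPunctuator ::
  ?. [lookahead ∉ DecimalDigit]

OtherPunctuator :: one of
  { ( ) [ ] . ... ; , < > <= >= == != === !== + - * % ** ++ -- << >> >>> & | ^ ! ~ && || ?? ? : = += -= *= %= **= <<= >>= >>>= &= |= ^= &&= ||= ??= =>

DivPunctuator ::
  /
  /=

RightBracePunctuator ::
  }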
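The lookahead restriction on OptionalChainingPunctuator keeps existing conditional expressions working when the second operand is a decimal literal that begins with a dot. A small sketch with illustrative variable names:

```javascript
const flag = true;
// `?.` is not recognized here because a DecimalDigit follows it,
// so this is the conditional expression  flag ? .5 : 1.
const viaConditional = flag?.5:1;

const holder = { value: 7 };
// No digit follows, so `?.` is an OptionalChainingPunctuator.
const viaOptionalChain = holder?.value;

console.log(viaConditional); // 0.5
console.log(viaOptionalChain); // 7
```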

12.9 Literals

12.9.1 Null Literals

Syntax

NullLiteral ::
  null

12.9.2 Boolean Literals

Syntax

BooleanLiteral ::
  true
  false

12.9.3 Numeric Literals

Syntax

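NumericLiteralSeparator ::
  _

NumericLiteral ::
  DecimalLiteral
  DecimalBigIntegerLiteral
  NonDecimalIntegerLiteral[+Sep]
  NonDecimalIntegerLiteral[+Sep] BigIntLiteralSuffix
  LegacyOctalIntegerLiteral

DecimalBigIntegerLiteral ::
  0 BigIntLiteralSuffix
  NonZeroDigit DecimalDigits[+Sep]_opt BigIntLiteralSuffix
  NonZeroDigit NumericLiteralSeparator DecimalDigits[+Sep] BigIntLiteralSuffix

NonDecimalIntegerLiteral[Sep] ::
  BinaryIntegerLiteral[?Sep]
  OctalIntegerLiteral[?Sep]
  HexIntegerLiteral[?Sep]

BigIntLiteralSuffix ::
  n

DecimalLiteral ::
  DecimalIntegerLiteral . DecimalDigits[+Sep]_opt ExponentPart[+Sep]_opt
  . DecimalDigits[+Sep] ExponentPart[+Sep]_opt
  DecimalIntegerLiteral ExponentPart[+Sep]_opt

DecimalIntegerLiteral ::
  0
  NonZeroDigit
  NonZeroDigit NumericLiteralSeparator_opt DecimalDigits[+Sep]
  NonOctalDecimalIntegerLiteral

DecimalDigits[Sep] ::
  DecimalDigit
  DecimalDigits[?Sep] DecimalDigit
  [+Sep] DecimalDigits[+Sep] NumericLiteralSeparator DecimalDigit

DecimalDigit :: one of
  0 1 2 3 4 5 6 7 8 9

NonZeroDigit :: one of
  1 2 3 4 5 6 7 8 9

ExponentPart[Sep] ::
  ExponentIndicator SignedInteger[?Sep]

ExponentIndicator :: one of
  e E

SignedInteger[Sep] ::
  DecimalDigits[?Sep]
  + DecimalDigits[?Sep]
  - DecimalDigits[?Sep]

BinaryIntegerLiteral[Sep] ::
  0b BinaryDigits[?Sep]
  0B BinaryDigits[?Sep]

BinaryDigits[Sep] ::
  BinaryDigit
  BinaryDigits[?Sep] BinaryDigit
  [+Sep] BinaryDigits[+Sep] NumericLiteralSeparator BinaryDigit

BinaryDigit :: one of
  0 1

OctalIntegerLiteral[Sep] ::
  0o OctalDigits[?Sep]
  0O OctalDigits[?Sep]

OctalDigits[Sep] ::
  OctalDigit
  OctalDigits[?Sep] OctalDigit
  [+Sep] OctalDigits[+Sep] NumericLiteralSeparator OctalDigit

LegacyOctalIntegerLiteral ::
  0 OctalDigit
  LegacyOctalIntegerLiteral OctalDigit

NonOctalDecimalIntegerLiteral ::
  0 NonOctalDigit
  LegacyOctalLikeDecimalIntegerLiteral NonOctalDigit
  NonOctalDecimalIntegerLiteral DecimalDigit

LegacyOctalLikeDecimalIntegerLiteral ::
  0 OctalDigit
  LegacyOctalLikeDecimalIntegerLiteral OctalDigit

OctalDigit :: one of
  0 1 2 3 4 5 6 7

NonOctalDigit :: one of
  8 9

HexIntegerLiteral[Sep] ::
  0x HexDigits[?Sep]
  0X HexDigits[?Sep]

HexDigits[Sep] ::
  HexDigit
  HexDigits[?Sep] HexDigit
  [+Sep] HexDigits[+Sep] NumericLiteralSeparator HexDigit

HexDigit :: one of
  0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F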

The SourceCharacter immediately following a NumericLiteral must not be an IdentifierStart or DecimalDigit.

Note

For example: 3in is an error and not the two input elements 3 and in.

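12.9.3.1 Static Semantics: Early Errors

NumericLiteral :: LegacyOctalIntegerLiteral
DecimalIntegerLiteral :: NonOctalDecimalIntegerLiteral

It is a Syntax Error if IsStrict(this production) is true.

Note

In non-strict code, this syntax is Legacy.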
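The restriction on the character that follows a NumericLiteral, the separator rules, and the legacy forms can all be checked by parsing. The sketch below assumes the Function constructor is available and produces non-strict code unless the body begins with a Use Strict Directive; `parsesAsFunctionBody` is a hypothetical helper.

```javascript
// Hypothetical helper: report whether `source` parses as a function body.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// A NumericLiteral must not be followed directly by an IdentifierStart.
console.log(parsesAsFunctionBody("return 3 in [0, 0, 0, 0];")); // true
console.log(parsesAsFunctionBody("return 3in [0, 0, 0, 0];")); // false

// Separators must sit between digits.
console.log(parsesAsFunctionBody("return 1_000;")); // true
console.log(parsesAsFunctionBody("return 1__000;")); // false
console.log(parsesAsFunctionBody("return 1_;")); // false

// Legacy forms are accepted in non-strict code and rejected in strict mode code.
console.log(new Function("return 010;")()); // 8, a LegacyOctalIntegerLiteral
console.log(new Function("return 089;")()); // 89, a NonOctalDecimalIntegerLiteral
console.log(parsesAsFunctionBody('"use strict"; return 010;')); // false
console.log(parsesAsFunctionBody('"use strict"; return 089;')); // false
```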

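12.9.3.2 Static Semantics: MV

A numeric literal stands for a value of the Number type or the BigInt type.

The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits is the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × 10^(-n)), where n is the number of code points in DecimalDigits, excluding all occurrences of NumericLiteralSeparator.
The MV of DecimalLiteral :: DecimalIntegerLiteral . ExponentPart is the MV of DecimalIntegerLiteral × 10^e, where e is the MV of ExponentPart.
The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits ExponentPart is (the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × 10^(-n))) × 10^e, with n and e as defined above.
The MV of DecimalLiteral :: . DecimalDigits is the MV of DecimalDigits × 10^(-n).
The MV of DecimalLiteral :: . DecimalDigits ExponentPart is the MV of DecimalDigits × 10^(e - n).
The MV of DecimalLiteral :: DecimalIntegerLiteral ExponentPart is the MV of DecimalIntegerLiteral × 10^e.
The MV of DecimalIntegerLiteral :: 0 is 0.
The MV of DecimalIntegerLiteral :: NonZeroDigit NumericLiteralSeparator_opt DecimalDigits is (the MV of NonZeroDigit × 10^n) plus the MV of DecimalDigits, where n is the number of code points in DecimalDigits, excluding all occurrences of NumericLiteralSeparator.
The MV of DecimalDigits :: DecimalDigits DecimalDigit is (the MV of DecimalDigits × 10) plus the MV of DecimalDigit.
The MV of DecimalDigits :: DecimalDigits NumericLiteralSeparator DecimalDigit is (the MV of DecimalDigits × 10) plus the MV of DecimalDigit.
The MV of ExponentPart :: ExponentIndicator SignedInteger is the MV of SignedInteger.
The MV of SignedInteger :: - DecimalDigits is the negative of the MV of DecimalDigits.
The MV of DecimalDigit :: 0 or of HexDigit :: 0 or of OctalDigit :: 0 or of LegacyOctalEscapeSequence :: 0 or of BinaryDigit :: 0 is 0.
The MV of DecimalDigit :: 1, and of the other digit productions that match 1 in the same way, is 1.
The MV rules for the digits 2 through 9 and the letters a through f and A through F repeat the same pattern; each digit or letter has the value it denotes.
The MV of BinaryDigits :: BinaryDigits BinaryDigit is (the MV of BinaryDigits × 2) plus the MV of BinaryDigit.
The BinaryDigits, OctalDigits, and HexDigits productions that contain a NumericLiteralSeparator are computed in the same way; the separator contributes nothing.
The MV of the legacy octal, non-octal decimal, and HexDigits combinations accumulates according to the radix of the notation: (the MV of the preceding digits × 8, 10, or 16) plus the MV of the new digit.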
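The digit-by-digit accumulation described by these MV rules can be sketched as a loop. The helper `digitsMV` is hypothetical; it assumes a well-formed digit sequence for the given radix and uses BigInt so the result stays exact.

```javascript
// Hypothetical helper: the MV of a digit sequence in the given radix.
// Each step computes (MV so far × radix) + MV of the new digit;
// a NumericLiteralSeparator contributes nothing.
function digitsMV(digits, radix) {
  let mv = 0n;
  for (const ch of digits) {
    if (ch === "_") continue;
    mv = mv * BigInt(radix) + BigInt(parseInt(ch, radix));
  }
  return mv;
}

// Compare against the engine's own interpretation of the same literals.
console.log(digitsMV("1010_1010", 2) === 0b1010_1010n); // true
console.log(digitsMV("ff_FF", 16) === 0xff_FFn); // true
console.log(digitsMV("17", 8) === 0o17n); // true
console.log(digitsMV("1_000", 10) === 1_000n); // true
```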

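12.9.3.3 Static Semantics: NumericValue : a Number or a BigInt

The syntax-directed operation NumericValue takes no arguments and returns a Number or a BigInt. It is defined piecewise over the following productions:

NumericLiteral :: DecimalLiteral

Return RoundMVResult(MV of DecimalLiteral).

NumericLiteral :: NonDecimalIntegerLiteral

Return 𝔽(MV of NonDecimalIntegerLiteral).

NumericLiteral :: LegacyOctalIntegerLiteral

Return 𝔽(MV of LegacyOctalIntegerLiteral).

NumericLiteral :: NonDecimalIntegerLiteral BigIntLiteralSuffix

Return the BigInt value for the MV of NonDecimalIntegerLiteral.

DecimalBigIntegerLiteral :: 0 BigIntLiteralSuffix

Return 0_ℤ.

DecimalBigIntegerLiteral :: NonZeroDigit BigIntLiteralSuffix

Return the BigInt value for the MV of NonZeroDigit.

DecimalBigIntegerLiteral :: NonZeroDigit DecimalDigits BigIntLiteralSuffix
DecimalBigIntegerLiteral :: NonZeroDigit NumericLiteralSeparator DecimalDigits BigIntLiteralSuffix

Let n be the number of code points in DecimalDigits, excluding all occurrences of NumericLiteralSeparator.
Let mv be (the MV of NonZeroDigit × 10^n) plus the MV of DecimalDigits.
Return ℤ(mv).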

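12.9.4 String Literals

Note 1

A string literal is 0 or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values Unicode code points are UTF-16 encoded as defined in 11.1.1. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.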
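The encoding behaviour described in this note is visible through the length of the resulting String values. The sketch below uses illustrative variable names and also shows a LineContinuation, which contributes no code units.

```javascript
const bmp = "\u00e9"; // a BMP code point: one code unit
const astral = "\u{1F600}"; // a code point outside the BMP: two code units

console.log(bmp.length); // 1
console.log(astral.length); // 2
console.log(astral === "\uD83D\uDE00"); // true: the surrogate pair spelled out

// A backslash followed by a line terminator is a LineContinuation.
const continued = "ab\
cd";
console.log(continued); // abcd

// An escape sequence is the way to put a line feed into the value.
const withLineFeed = "ab\ncd";
console.log(withLineFeed.length); // 5
```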

Syntax

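StringLiteral ::
  " DoubleStringCharacters_opt "
  ' SingleStringCharacters_opt '

DoubleStringCharacters ::
  DoubleStringCharacter DoubleStringCharacters_opt

SingleStringCharacters ::
  SingleStringCharacter SingleStringCharacters_opt

DoubleStringCharacter ::
  SourceCharacter but not one of " or \ or LineTerminator
  <LS>
  <PS>
  \ EscapeSequence
  LineContinuation

SingleStringCharacter ::
  SourceCharacter but not one of ' or \ or LineTerminator
  <LS>
  <PS>
  \ EscapeSequence
  LineContinuation

LineContinuation ::
  \ LineTerminatorSequence

EscapeSequence ::
  CharacterEscapeSequence
  0 [lookahead ∉ DecimalDigit]
  LegacyOctalEscapeSequence
  NonOctalDecimalEscapeSequence
  HexEscapeSequence
  UnicodeEscapeSequence

CharacterEscapeSequence ::
  SingleEscapeCharacter
  NonEscapeCharacter

SingleEscapeCharacter :: one of
  ' " \ b f n r t v

NonEscapeCharacter ::
  SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
  SingleEscapeCharacter
  DecimalDigit
  x
  u

LegacyOctalEscapeSequence ::
  0 [lookahead ∈ { 8, 9 }]
  NonZeroOctalDigit [lookahead ∉ OctalDigit]
  ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
  FourToSeven OctalDigit
  ZeroToThree OctalDigit OctalDigit

NonZeroOctalDigit ::
  OctalDigit but not 0

ZeroToThree :: one of
  0 1 2 3

FourToSeven :: one of
  4 5 6 7

NonOctalDecimalEscapeSequence :: one of
  8 9

HexEscapeSequence ::
  x HexDigit HexDigit

UnicodeEscapeSequence ::
  u Hex4Digits
  u{ CodePoint }

Hex4Digits ::
  HexDigit HexDigit HexDigit HexDigit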

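The definition of the nonterminal HexDigit is given in 12.9.3. SourceCharacter is defined in 11.1.

Note 2

<LF> and <CR> cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as \n or \u000A.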

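12.9.4.1 Static Semantics: Early Errors

EscapeSequence ::
  LegacyOctalEscapeSequence
  NonOctalDecimalEscapeSequence

It is a Syntax Error if IsStrict(this production) is true.

Note 1

In non-strict code, this syntax is Legacy.

Note 2

It is possible for string literals to precede a Use Strict Directive that places the enclosing code in strict mode, and implementations must take care to enforce the above rules for such literals. For example, the following source text contains a Syntax Error: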

function invalid() { "\7"; "use strict"; }
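The example can be checked by handing it to the parser. This sketch assumes the Function constructor is available; `parsesAsFunctionBody` is a hypothetical helper, and the inner function declarations are only parsed, never called.

```javascript
// Hypothetical helper: report whether `source` parses as a function body.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// The legacy escape precedes the directive but is still rejected.
console.log(parsesAsFunctionBody('function invalid() { "\\7"; "use strict"; }')); // false
// Without the directive the same escape is legacy syntax and parses.
console.log(parsesAsFunctionBody('function legacy() { "\\7"; }')); // true
// In non-strict code \7 denotes the code unit 0x0007 and \8 denotes "8".
console.log(new Function('return "\\7";')() === "\u0007"); // true
console.log(new Function('return "\\8";')() === "8"); // true
```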

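12.9.4.2 Static Semantics: SV : a String

The syntax-directed operation SV takes no arguments and returns a String.

A string literal stands for a value of the String type. SV produces String values for string literals through recursive application on the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value, as described below or in 12.9.3.

The SV of StringLiteral :: " " is the empty String.
The SV of StringLiteral :: ' ' is the empty String.
The SV of DoubleStringCharacters :: DoubleStringCharacter DoubleStringCharacters is the string-concatenation of the SV of DoubleStringCharacter and the SV of DoubleStringCharacters.
The SV of SingleStringCharacters :: SingleStringCharacter SingleStringCharacters is the string-concatenation of the SV of SingleStringCharacter and the SV of SingleStringCharacters.
The SV of DoubleStringCharacter :: SourceCharacter but not one of " or \ or LineTerminator is the result of performing UTF16EncodeCodePoint on the code point matched by SourceCharacter.
The SV of DoubleStringCharacter :: <LS> is the String value consisting of the code unit 0x2028 (LINE SEPARATOR).
The SV of DoubleStringCharacter :: <PS> is the String value consisting of the code unit 0x2029 (PARAGRAPH SEPARATOR).
The SV of DoubleStringCharacter :: LineContinuation is the empty String.
The SV of each SingleStringCharacter alternative follows the same rules.
The SV of EscapeSequence :: 0 is the String value consisting of the code unit 0x0000 (NULL).
The SV of CharacterEscapeSequence :: SingleEscapeCharacter is the String value consisting of the code unit whose numeric value is determined by the SingleEscapeCharacter according to Table 35.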

Table 35: String Single Character Escape Sequences

Escape Sequence	Code Unit Value	Unicode Character Name	Symbol
`\b`	`0x0008`	BACKSPACE	<BS>
`\t`	`0x0009`	CHARACTER TABULATION	<HT>
`\n`	`0x000A`	LINE FEED (LF)	<LF>
`\v`	`0x000B`	LINE TABULATION	<VT>
`\f`	`0x000C`	FORM FEED (FF)	<FF>
`\r`	`0x000D`	CARRIAGE RETURN (CR)	<CR>
`\"`	`0x0022`	QUOTATION MARK	`"`
`\'`	`0x0027`	APOSTROPHE	`'`
`\\`	`0x005C`	REVERSE SOLIDUS	`\`

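The SV of NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator is the result of performing UTF16EncodeCodePoint on the code point matched by SourceCharacter.
The SV of EscapeSequence :: LegacyOctalEscapeSequence is the String value consisting of the code unit whose numeric value is the MV of LegacyOctalEscapeSequence.
The SV of NonOctalDecimalEscapeSequence :: 8 is the String value consisting of the code unit 0x0038 (DIGIT EIGHT).
The SV of NonOctalDecimalEscapeSequence :: 9 is the String value consisting of the code unit 0x0039 (DIGIT NINE).
The SV of HexEscapeSequence :: x HexDigit HexDigit is the String value consisting of the code unit whose numeric value is the MV of HexEscapeSequence.
The SV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is the String value consisting of the code unit whose numeric value is the MV of Hex4Digits.
The SV of UnicodeEscapeSequence :: u{ CodePoint } is the result of performing UTF16EncodeCodePoint on the MV of CodePoint.
The SV of TemplateEscapeSequence :: 0 is the String value consisting of the code unit 0x0000 (NULL).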
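Several SV rules rely on UTF16EncodeCodePoint. A minimal sketch of that step is shown below, under the assumption that the input is a valid code point in the range 0 to 0x10FFFF; the JavaScript function name is illustrative, and the result is compared with the built-in String.fromCodePoint.

```javascript
// Sketch of UTF-16 encoding for a single code point `cp` (a number).
function utf16EncodeCodePoint(cp) {
  // BMP code points are encoded as a single code unit.
  if (cp <= 0xffff) return String.fromCharCode(cp);
  // All other code points are encoded as a lead and a trail surrogate.
  const offset = cp - 0x10000;
  const lead = Math.floor(offset / 0x400) + 0xd800;
  const trail = (offset % 0x400) + 0xdc00;
  return String.fromCharCode(lead, trail);
}

console.log(utf16EncodeCodePoint(0x41) === "A"); // true
console.log(utf16EncodeCodePoint(0x1f600) === "\u{1F600}"); // true
console.log(utf16EncodeCodePoint(0x1f600).length); // 2
```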

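12.9.4.3 Static Semantics: MV

The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit is (8 times the MV of ZeroToThree) plus the MV of OctalDigit.
The MV of LegacyOctalEscapeSequence :: FourToSeven OctalDigit is (8 times the MV of FourToSeven) plus the MV of OctalDigit.
The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit is (64 times the MV of ZeroToThree) plus (8 times the MV of the first OctalDigit) plus the MV of the second OctalDigit.
The MV of ZeroToThree :: 0 is 0.
The MV of ZeroToThree :: 1 is 1; the rules for 2 and 3 follow the same pattern.
The MV of FourToSeven :: 4 is 4; the rules for 5, 6, and 7 follow the same pattern.
The MV of HexEscapeSequence :: x HexDigit HexDigit is (16 times the MV of the first HexDigit) plus the MV of the second HexDigit.
The MV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is (0x1000 times the MV of the first HexDigit) plus (0x100 times the MV of the second HexDigit) plus (0x10 times the MV of the third HexDigit) plus the MV of the fourth HexDigit.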
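These MV rules are ordinary positional arithmetic. The sketch below spells out two of them with hypothetical helper names and checks the results against escapes evaluated by the engine; it assumes non-strict code for the legacy octal escape.

```javascript
// Hypothetical helper: MV of a three-digit LegacyOctalEscapeSequence such as "101".
function legacyOctalEscapeMV(digits) {
  const [first, second, third] = [...digits].map((d) => parseInt(d, 8));
  return 64 * first + 8 * second + third;
}

// Hypothetical helper: MV of Hex4Digits such as "0041".
function hex4DigitsMV(digits) {
  const [a, b, c, d] = [...digits].map((h) => parseInt(h, 16));
  return 0x1000 * a + 0x100 * b + 0x10 * c + d;
}

console.log(legacyOctalEscapeMV("101")); // 65
console.log(hex4DigitsMV("0041")); // 65
console.log(new Function('return "\\101";')() === "\u0041"); // true: both denote "A"
```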

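12.9.5 Regular Expression Literals

Note 1

A regular expression literal is an input element that is converted to a RegExp object (see 22.2) each time the literal is evaluated. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp or calling the RegExp constructor as a function (see 22.2.4).

The productions below describe the syntax for a regular expression literal and are used by the input element scanner to find the end of the regular expression literal. The source text comprising the RegularExpressionBody and the RegularExpressionFlags are subsequently parsed again using the more stringent ECMAScript Regular Expression grammar (22.2.1).

An implementation may extend the ECMAScript Regular Expression grammar defined in 22.2.1, but it must not extend the RegularExpressionBody and RegularExpressionFlags productions defined below or the productions used by these productions.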

Syntax

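RegularExpressionLiteral ::
  / RegularExpressionBody / RegularExpressionFlags

RegularExpressionBody ::
  RegularExpressionFirstChar RegularExpressionChars

RegularExpressionChars ::
  [empty]
  RegularExpressionChars RegularExpressionChar

RegularExpressionFirstChar ::
  RegularExpressionNonTerminator but not one of * or \ or / or [
  RegularExpressionBackslashSequence
  RegularExpressionClass

RegularExpressionChar ::
  RegularExpressionNonTerminator but not one of \ or / or [
  RegularExpressionBackslashSequence
  RegularExpressionClass

RegularExpressionBackslashSequence ::
  \ RegularExpressionNonTerminator

RegularExpressionNonTerminator ::
  SourceCharacter but not LineTerminator

RegularExpressionClass ::
  [ RegularExpressionClassChars ]

RegularExpressionClassChars ::
  [empty]
  RegularExpressionClassChars RegularExpressionClassChar

RegularExpressionClassChar ::
  RegularExpressionNonTerminator but not one of ] or \
  RegularExpressionBackslashSequence

RegularExpressionFlags ::
  [empty]
  RegularExpressionFlags IdentifierPartChar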

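Note 2

Regular expression literals may not be empty; instead of representing an empty regular expression literal, the code unit sequence // starts a single-line comment. To specify an empty regular expression, use: /(?:)/.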
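Both notes in this section can be observed directly. The following sketch uses illustrative names to show that each evaluation of a literal yields a distinct object and that /(?:)/ is the spelling of the empty pattern.

```javascript
// Each evaluation of the literal creates a new RegExp object.
function makePattern() {
  return /ab+c/;
}

const first = makePattern();
const second = makePattern();
console.log(first === second); // false
console.log(first.source === second.source); // true

// The empty pattern must be written as /(?:)/ because // starts a comment.
const emptyPattern = /(?:)/;
console.log(emptyPattern.test("")); // true
console.log(new RegExp("").source); // (?:)
```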

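12.9.5.1 Static Semantics: BodyText : source text

The syntax-directed operation BodyText takes no arguments and returns source text. It is defined piecewise over the following productions:

RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags

Return the source text that was recognized as RegularExpressionBody.

12.9.5.2 Static Semantics: FlagText : source text

The syntax-directed operation FlagText takes no arguments and returns source text. It is defined piecewise over the following productions:

RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags

Return the source text that was recognized as RegularExpressionFlags.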

12.9.6 Template Literal Lexical Components

Syntax

Template ::
  NoSubstitutionTemplate
  TemplateHead

NoSubstitutionTemplate ::
  ` TemplateCharacters_opt `

TemplateHead ::
  ` TemplateCharacters_opt ${

TemplateSubstitutionTail ::
  TemplateMiddle
  TemplateTail

TemplateMiddle ::
  } TemplateCharacters_opt ${

TemplateTail ::
  } TemplateCharacters_opt `

TemplateCharacters ::
  TemplateCharacter TemplateCharacters_opt

TemplateCharacter ::
  $ [lookahead ≠ {]
  \ TemplateEscapeSequence
  \ NotEscapeSequence
  LineContinuation
  LineTerminatorSequence
  SourceCharacter but not one of ` or \ or $ or LineTerminator

TemplateEscapeSequence ::
  CharacterEscapeSequence
  0 [lookahead ∉ DecimalDigit]
  HexEscapeSequence
  UnicodeEscapeSequence

NotEscapeSequence ::
  0 DecimalDigit
  DecimalDigit but not 0
  x [lookahead ∉ HexDigit]
  x HexDigit [lookahead ∉ HexDigit]
  u [lookahead ∉ HexDigit] [lookahead ≠ {]
  u HexDigit [lookahead ∉ HexDigit]
  u HexDigit HexDigit [lookahead ∉ HexDigit]
  u HexDigit HexDigit HexDigit [lookahead ∉ HexDigit]
  u { [lookahead ∉ HexDigit]
  u { NotCodePoint [lookahead ∉ HexDigit]
  u { CodePoint [lookahead ∉ HexDigit] [lookahead ≠ }]

NotCodePoint ::
  HexDigits[~Sep] but only if the MV of HexDigits > 0x10FFFF

CodePoint ::
  HexDigits[~Sep] but only if the MV of HexDigits ≤ 0x10FFFF

Note

TemplateSubstitutionTail is used by the InputElementTemplateTail alternative lexical goal.

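12.9.6.1 Static Semantics: TV : a String or undefined

The syntax-directed operation TV takes no arguments and returns a String or undefined. A template literal component is interpreted by TV as a value of the String type. TV is used to construct the indexed components of a template object (colloquially, the template values). In TV, escape sequences are replaced by the UTF-16 code unit(s) of the Unicode code point represented by the escape sequence.

The TV of NoSubstitutionTemplate :: ` ` is the empty String.
The TV of TemplateHead :: ` ${ is the empty String.
The TV of TemplateMiddle :: } ${ is the empty String.
The TV of TemplateTail :: } ` is the empty String.
The TV of TemplateCharacters :: TemplateCharacter TemplateCharacters is undefined if the TV of TemplateCharacter is undefined or the TV of TemplateCharacters is undefined. Otherwise, it is the string-concatenation of the TV of TemplateCharacter and the TV of TemplateCharacters.
The TV of TemplateCharacter :: SourceCharacter but not one of ` or \ or $ or LineTerminator is the result of performing UTF16EncodeCodePoint on the code point matched by SourceCharacter.
The TV of TemplateCharacter :: $ is the String value consisting of the code unit 0x0024 (DOLLAR SIGN).
The TV of TemplateCharacter :: \ TemplateEscapeSequence is the SV of TemplateEscapeSequence.
The TV of TemplateCharacter :: \ NotEscapeSequence is undefined.
The TV of TemplateCharacter :: LineTerminatorSequence is the TRV of LineTerminatorSequence.
The TV of LineContinuation :: \ LineTerminatorSequence is the empty String.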

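12.9.6.2 Static Semantics: TRV : a String

The syntax-directed operation TRV takes no arguments and returns a String. A template literal component is interpreted by TRV as a value of the String type. TRV is used to construct the raw components of a template object (colloquially, the template raw values). TRV is similar to TV with the difference being that in TRV, escape sequences are interpreted as they appear in the literal.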

The TRV of NoSubstitutionTemplate :: ` ` is the empty String.
The TRV of TemplateHead :: ` ${ is the empty String.
The TRV of TemplateMiddle :: } ${ is the empty String.
The TRV of TemplateTail :: } ` is the empty String.
The TRV of TemplateCharacters :: TemplateCharacter TemplateCharacters is the string-concatenation of the TRV of TemplateCharacter and the TRV of TemplateCharacters.
The TRV of TemplateCharacter :: SourceCharacter but not one of ` or \ or $ or LineTerminator is the result of performing UTF16EncodeCodePoint on the code point matched by SourceCharacter.
The TRV of TemplateCharacter :: $ is the String value consisting of the code unit 0x0024 (DOLLAR SIGN).
The TRV of TemplateCharacter :: \ TemplateEscapeSequence is the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and the TRV of TemplateEscapeSequence.
The TRV of TemplateCharacter :: \ NotEscapeSequence is the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and the TRV of NotEscapeSequence.
The TRV of TemplateEscapeSequence :: 0 is the String value consisting of the code unit 0x0030 (DIGIT ZERO).
The TRV of NotEscapeSequence :: 0 DecimalDigit is the string-concatenation of the code unit 0x0030 (DIGIT ZERO) and the TRV of DecimalDigit.
The TRV of NotEscapeSequence :: x [lookahead ∉ HexDigit] is the String value consisting of the code unit 0x0078 (LATIN SMALL LETTER X).
The TRV of NotEscapeSequence :: x HexDigit [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0078 (LATIN SMALL LETTER X) and the TRV of HexDigit.
The TRV of NotEscapeSequence :: u [lookahead ∉ HexDigit] [lookahead ≠ {] is the String value consisting of the code unit 0x0075 (LATIN SMALL LETTER U).
The TRV of NotEscapeSequence :: u HexDigit [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U) and the TRV of HexDigit.
The TRV of NotEscapeSequence :: u HexDigit HexDigit [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U), the TRV of the first HexDigit, and the TRV of the second HexDigit.
The TRV of NotEscapeSequence :: u HexDigit HexDigit HexDigit [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U), the TRV of the first HexDigit, the TRV of the second HexDigit, and the TRV of the third HexDigit.
The TRV of NotEscapeSequence :: u { [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U) and the code unit 0x007B (LEFT CURLY BRACKET).
The TRV of NotEscapeSequence :: u { NotCodePoint [lookahead ∉ HexDigit] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U), the code unit 0x007B (LEFT CURLY BRACKET), and the TRV of NotCodePoint.
The TRV of NotEscapeSequence :: u { CodePoint [lookahead ∉ HexDigit] [lookahead ≠ }] is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U), the code unit 0x007B (LEFT CURLY BRACKET), and the TRV of CodePoint.
The TRV of DecimalDigit :: one of 0 1 2 3 4 5 6 7 8 9 is the result of performing UTF16EncodeCodePoint on the single code point matched by this production.
The TRV of CharacterEscapeSequence :: NonEscapeCharacter is the SV of NonEscapeCharacter.
The TRV of SingleEscapeCharacter :: one of ' " \ b f n r t v is the result of performing UTF16EncodeCodePoint on the single code point matched by this production.
The TRV of HexEscapeSequence :: x HexDigit HexDigit is the string-concatenation of the code unit 0x0078 (LATIN SMALL LETTER X), the TRV of the first HexDigit, and the TRV of the second HexDigit.
The TRV of UnicodeEscapeSequence :: u Hex4Digits is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U) and the TRV of Hex4Digits.
The TRV of UnicodeEscapeSequence :: u{ CodePoint } is the string-concatenation of the code unit 0x0075 (LATIN SMALL LETTER U), the code unit 0x007B (LEFT CURLY BRACKET), the TRV of CodePoint, and the code unit 0x007D (RIGHT CURLY BRACKET).
The TRV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is the string-concatenation of the TRV of the first HexDigit, the TRV of the second HexDigit, the TRV of the third HexDigit, and the TRV of the fourth HexDigit.
The TRV of HexDigits :: HexDigits HexDigit is the string-concatenation of the TRV of HexDigits and the TRV of HexDigit.
The TRV of HexDigit :: one of 0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F is the result of performing UTF16EncodeCodePoint on the single code point matched by this production.
The TRV of LineContinuation :: \ LineTerminatorSequence is the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and the TRV of LineTerminatorSequence.
The TRV of LineTerminatorSequence :: <LF> is the String value consisting of the code unit 0x000A (LINE FEED).
The TRV of LineTerminatorSequence :: <CR> is the String value consisting of the code unit 0x000A (LINE FEED).
The TRV of LineTerminatorSequence :: <LS> is the String value consisting of the code unit 0x2028 (LINE SEPARATOR).
The TRV of LineTerminatorSequence :: <PS> is the String value consisting of the code unit 0x2029 (PARAGRAPH SEPARATOR).
The TRV of LineTerminatorSequence :: <CR> <LF> is the String value consisting of the code unit 0x000A (LINE FEED).

Note

TV excludes the code units of LineContinuation while TRV includes them. <CR><LF> and <CR> LineTerminatorSequences are normalized to <LF> for both TV and TRV. An explicit TemplateEscapeSequence is needed to include a <CR> or <CR><LF> sequence.
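A tag function exposes both interpretations: the first argument holds the TV of each component and its raw property holds the TRV. The sketch below uses the hypothetical tag `cookedAndRaw`; the <CR><LF> case is built with the Function constructor so that the carriage return is guaranteed to be present in the source text.

```javascript
// Hypothetical tag function: return the TV and TRV of the first component.
function cookedAndRaw(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

// Valid escapes: TV holds the decoded code units, TRV holds the text as written.
const escaped = cookedAndRaw`a\n\u0041`;
console.log(escaped.cooked.length, escaped.raw.length); // 3 9

// A NotEscapeSequence: TV is undefined, TRV still reflects the source.
const invalid = cookedAndRaw`\unicode`;
console.log(invalid.cooked, invalid.raw); // undefined \unicode

// A LineContinuation: excluded from TV, included in TRV.
const continuation = cookedAndRaw`a\
b`;
console.log(continuation.cooked); // ab

// <CR><LF> in the source is normalized to <LF> in both TV and TRV.
const crlf = new Function("tag", "return tag`a\r\nb`;")(cookedAndRaw);
console.log(crlf.cooked === "a\nb", crlf.raw === "a\nb"); // true true
```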

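12.10 Automatic Semicolon Insertion

Most ECMAScript statements and declarations must be terminated with a semicolon. Such semicolons may always appear explicitly in the source text. For convenience, however, such semicolons may be omitted from the source text in certain situations. These situations are described by saying that semicolons are automatically inserted into the source code token stream in those situations.

12.10.1 Rules of Automatic Semicolon Insertion

In the following rules, “token” means the actual recognized lexical token determined using the current lexical goal symbol as described in clause 12.

There are three basic rules of semicolon insertion:

When, as the source text is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
- The offending token is separated from the previous token by at least one LineTerminator.
- The offending token is }.
- The previous token is ) and the inserted semicolon would then be parsed as the terminating semicolon of a do-while statement (14.7.2).
When, as the source text is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single instance of the goal nonterminal, then a semicolon is automatically inserted at the end of the input stream.
When, as the source text is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 14.7.4).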
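The rules can be exercised with a few small programs. This is an illustrative sketch, assuming the Function constructor is available for the parse-only checks; `parsesAsFunctionBody` and the other names are hypothetical.

```javascript
// Hypothetical helper: report whether `source` parses as a function body.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// Restricted production: a line terminator after `return` ends the statement.
function returnThenNewline() {
  return
  42;
}
console.log(returnThenNewline()); // undefined

// Restricted token: ++ on its own line binds to the operand that follows it.
let left = 1;
let right = 2;
left
++
right
console.log(left, right); // 1 3

// Offending token after a line terminator, or offending token }.
console.log(parsesAsFunctionBody("{ 1\n2 } 3")); // true
console.log(parsesAsFunctionBody("{ 1 2 } 3")); // false
// Overriding condition: no insertion inside a for header or as an empty statement.
console.log(parsesAsFunctionBody("for (a; b\n) {}")); // false
console.log(parsesAsFunctionBody("if (a > b)\nelse c = d")); // false
// Restricted production without a usable fallback.
console.log(parsesAsFunctionBody("throw\nnew Error('x');")); // false
```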

Note

The following are the only restricted productions in the grammar:

UpdateExpression[Yield, Await] :
  LeftHandSideExpression[?Yield, ?Await] [no LineTerminator here] ++
  LeftHandSideExpression[?Yield, ?Await] [no LineTerminator here] --

ContinueStatement[Yield, Await] :
  continue ;
  continue [no LineTerminator here] LabelIdentifier[?Yield, ?Await] ;

BreakStatement[Yield, Await] :
  break ;
  break [no LineTerminator here] LabelIdentifier[?Yield, ?Await] ;

ReturnStatement[Yield, Await] :
  return ;
  return [no LineTerminator here] Expression[+In, ?Yield, ?Await] ;

ThrowStatement[Yield, Await] :
  throw [no LineTerminator here] Expression[+In, ?Yield, ?Await] ;

YieldExpression[In, Await] :
  yield [no LineTerminator here] AssignmentExpression[?In, +Yield, ?Await]
  yield [no LineTerminator here] * AssignmentExpression[?In, +Yield, ?Await]

ArrowFunction[In, Yield, Await] :
  ArrowParameters[?Yield, ?Await] [no LineTerminator here] => ConciseBody[?In]

AsyncFunctionDeclaration[Yield, Await, Default] :
  async [no LineTerminator here] function BindingIdentifier[?Yield, ?Await] ( FormalParameters[~Yield, +Await] ) { AsyncFunctionBody }
  [+Default] async [no LineTerminator here] function ( FormalParameters[~Yield, +Await] ) { AsyncFunctionBody }

AsyncFunctionExpression :
  async [no LineTerminator here] function BindingIdentifier[~Yield, +Await]_opt ( FormalParameters[~Yield, +Await] ) { AsyncFunctionBody }

AsyncMethod[Yield, Await] :
  async [no LineTerminator here] ClassElementName[?Yield, ?Await] ( UniqueFormalParameters[~Yield, +Await] ) { AsyncFunctionBody }

AsyncGeneratorDeclaration[Yield, Await, Default] :
  async [no LineTerminator here] function * BindingIdentifier[?Yield, ?Await] ( FormalParameters[+Yield, +Await] ) { AsyncGeneratorBody }
  [+Default] async [no LineTerminator here] function * ( FormalParameters[+Yield, +Await] ) { AsyncGeneratorBody }

AsyncGeneratorExpression :
  async [no LineTerminator here] function * BindingIdentifier[+Yield, +Await]_opt ( FormalParameters[+Yield, +Await] ) { AsyncGeneratorBody }

AsyncGeneratorMethod

[Yield, Await]

async

[no LineTerminator here]

ClassElementName

[?Yield, ?Await]

(

UniqueFormalParameters

[+Yield, +Await]

)

{