Character classes | JavaScript

Character classes distinguish kinds of characters, for example letters from digits.

Types

Characters	Meaning
`.`	Has one of the following meanings: Matches any single character except line terminators: `\n`, `\r`, `\u2028` or `\u2029`. For example, `/.y/` matches "my" and "ay", but not "yes", in "yes make my day". Inside a character set, the dot loses its special meaning and matches a literal dot. Note that the `m` (multiline) flag doesn't change the dot's behavior, so to match a pattern across multiple lines you can use the character set `[^]`, which matches any character including newlines. ES2018 added the `s` (dotAll) flag, which allows the dot to also match line terminators.
`\d`	Matches any digit (Arabic numeral). Equivalent to `[0-9]`. For example, `/\d/` or `/[0-9]/` matches "2" in "B2 is the suite number".
`\D`	Matches any character that is not a digit (Arabic numeral). Equivalent to `[^0-9]`. For example, `/\D/` or `/[^0-9]/` matches "B" in "B2 is the suite number".
`\w`	Matches any alphanumeric character from the basic Latin alphabet, including the underscore. Equivalent to `[A-Za-z0-9_]`. For example, `/\w/` matches "a" in "apple", "5" in "$5.28", and "3" in "3D".
`\W`	Matches any character that is not a word character from the basic Latin alphabet. Equivalent to `[^A-Za-z0-9_]`. For example, `/\W/` or `/[^A-Za-z0-9_]/` matches "%" in "50%".
`\s`	Matches a single whitespace character, including space, tab, form feed, line feed, and other Unicode spaces. Equivalent to `[ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]`. For example, `/\s\w*/` matches " bar" in "foo bar".
`\S`	Matches a single character other than whitespace. Equivalent to `[^ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]`. For example, `/\S\w*/` matches "foo" in "foo bar".
`\t`	Matches a horizontal tab.
`\r`	Matches a carriage return.
`\n`	Matches a linefeed.
`\v`	Matches a vertical tab.
`\f`	Matches a form-feed.
`[\b]`	Matches a backspace. If you're looking for the word-boundary character (`\b`), see Boundaries.
`\0`	Matches a NUL character. Do not follow this with another digit.
`\cX`	Matches a control character using caret notation, where "X" is a letter from A–Z (corresponding to code points `U+0001`–`U+001A`). For example, `/\cM/` matches "\r" in "\r\n".
`\xhh`	Matches the character with the code `hh` (two hexadecimal digits).
`\uhhhh`	Matches a UTF-16 code unit with the value `hhhh` (four hexadecimal digits).
`\u{hhhh}` or `\u{hhhhh}`	(Only when the `u` flag is set.) Matches the character with the Unicode value `U+hhhh` or `U+hhhhh` (hexadecimal digits).
`\`	Indicates that the following character should be treated specially, or "escaped". It behaves in one of two ways. For characters that are usually treated literally, it indicates that the next character is special and not to be interpreted literally. For example, `/b/` matches the character "b". Placing a backslash in front of "b", as in `/\b/`, makes it special: it now matches a word boundary. For characters that are usually treated specially, it indicates that the next character is not special and should be interpreted literally. For example, "*" is a special character that means 0 or more occurrences of the preceding character should be matched; `/a*/` matches 0 or more "a"s. To match `*` literally, precede it with a backslash; for example, `/a\*/` matches "a*". To match a backslash literally, escape it with itself: to search for `\`, use `/\\/`.
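
Several rows of the table are easier to follow when run. The sketch below is illustrative only: the sample strings and variable names are made up for this example and do not come from the specification.

```javascript
// Hypothetical sample text containing a line feed.
const twoLines = "first\nsecond";

// By default the dot does not match line terminators.
const dotDefault = /first.second/.test(twoLines);   // false
// The "m" flag changes ^ and $, not the dot.
const dotMultiline = /first.second/m.test(twoLines); // false
// The "s" (dotAll) flag lets the dot match "\n".
const dotAll = /first.second/s.test(twoLines);       // true
// [^] matches any character, including line terminators.
const anyChar = /first[^]second/.test(twoLines);     // true

// Inside a character set the dot is a literal dot.
const literalDot = "a.b,c".match(/[.]/g);            // ["."]

// \cM is the carriage return (U+000D).
const controlChar = /\cM/.test("\r\n");              // true

// \u{...} is only read as a code point escape with the "u" flag.
const withFlag = /\u{41}/u.test("A");                // true
const withoutFlag = /\u{41}/.test("A");              // false: parsed as "u" repeated 41 times

// Escaping "*" makes it match a literal asterisk.
const star = "a*".match(/a\*/)[0];                   // "a*"

console.log({ dotDefault, dotMultiline, dotAll, anyChar, literalDot, controlChar, withFlag, withoutFlag, star });
```

The `s` flag requires an ES2018-capable engine. On older engines `[^]` is the usual fallback.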

Examples

Looking for a series of digits

var randomData = "015 354 8787 687351 3512 8735";
var regexpFourDigits = /\b\d{4}\b/g;
// \b indicates a boundary (i.e. do not start matching in the middle of a word)
// \d{4} indicates a digit, four times
// \b indicates another boundary (i.e. do not end matching in the middle of a word)
console.table(randomData.match(regexpFourDigits));
// ['8787', '3512', '8735']

Looking for a word (from the Latin alphabet) starting with A

var aliceExcerpt = "I’m sure I’m not Ada,’ she said, ‘for her hair goes in such long ringlets, and mine doesn’t go in ringlets at all.";
var regexpWordStartingWithA = /\b[aA]\w+/g;
// \b indicates a boundary (i.e. do not start matching in the middle of a word)
// [aA] indicates the letter a or A
// \w+ indicates one or more word characters (basic Latin letters, digits, or underscore)
console.table(aliceExcerpt.match(regexpWordStartingWithA));
// ['Ada', 'and', 'at', 'all']
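
As the Types table states, `\w` is equivalent to `[A-Za-z0-9_]`, so it accepts digits and the underscore but rejects accented or non-Latin letters. A minimal sketch of that behavior, using hypothetical sample strings that are not part of the original examples:

```javascript
// Hypothetical sample strings, chosen only to illustrate the limits of \w.
const identifierText = "snake_case42 caf\u00e9"; // "\u00e9" is "é"
const latinMatches = identifierText.match(/\w+/g);
// "_" and digits count as word characters; "é" does not, so "café" is cut short.
console.log(latinMatches); // ["snake_case42", "caf"]

const cyrillicMatches = "Алиса".match(/\w+/g);
// No character of the Cyrillic word is in [A-Za-z0-9_], so there is no match at all.
console.log(cyrillicMatches); // null
```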

Looking for a word (from Unicode characters)

Instead of the Latin alphabet, we can use a range of Unicode characters to identify a word (thus being able to deal with text in other languages like Russian or Arabic). The "Basic Multilingual Plane" of Unicode contains most of the characters used around the world and we can use character classes and ranges to match words written with those characters.

var nonEnglishText = "Приключения Алисы в Стране чудес";
var regexpBMPWord = /([\u0000-\u001F\u0021-\uFFFF])+/gu;
// BMP goes through U+0000 to U+FFFF but space is U+0020
console.table(nonEnglishText.match(regexpBMPWord));
// ['Приключения', 'Алисы', 'в', 'Стране', 'чудес']
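
One consequence of the range-based approach is that everything except the space counts as part of a "word", punctuation included. The sketch below illustrates this with a hypothetical sample string. It also shows Unicode property escapes (`\p{L}`, available with the `u` flag since ES2018) as one possible alternative. That alternative is not part of the original example and is offered only as an assumption about what a stricter match might look like.

```javascript
// Hypothetical sample text with punctuation attached to the words.
const greeting = "Алиса, привет!";

// \w only covers [A-Za-z0-9_], so nothing matches.
const withWordClass = greeting.match(/\w+/g);
console.log(withWordClass); // null

// A range that excludes only the space keeps "," and "!" in the matches.
const withRange = greeting.match(/[\u0021-\uFFFF]+/gu);
console.log(withRange); // ["Алиса,", "привет!"]

// Alternative (not from the original): \p{L} matches letters of any script.
const withLetterProperty = greeting.match(/\p{L}+/gu);
console.log(withLetterProperty); // ["Алиса", "привет"]
```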

Specifications

Specification
ECMAScript Latest Draft (ECMA-262): the definition of 'RegExp: Character classes' in that specification.

Browser compatibility

For browser compatibility information, check out the main Regular Expressions compatibility table.

Metadata

Last modified: June 9, 2020
