Regular Expressions in Practice: From Beginner to Mastery

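A few years ago I wrote a seemingly harmless validation regex for a user upload endpoint. It looked roughly like ^(\w+\s?)+$. The unit tests were all green locally, and launch day was uneventful. Then one day a user pasted in a string of two hundred-odd characters whose last character failed to match, and the machine's CPU went straight to 100%. The request hung for tens of seconds without returning, and the monitoring dashboard lit up red. I later learned that this is classic catastrophic backtracking, commonly known as ReDoS. Being woken up in the middle of the night to debug it changed my attitude toward regex from "this is great" to "think it through before you write it".

So what follows is not a textbook-style list of syntax. It is mostly what I have picked up from years of stepping on landmines: which patterns look right but are actually traps, and which situations should never involve regex at all. Syntax still gets covered where it needs to be, but mostly I want to spare you the detours I took.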

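Metacharacter cheat sheet (with my own pitfall notes)

Before writing any complex regex I glance at this table, so I don't mix up greedy and lazy:

Syntax	Meaning	Pitfall
`.*`	Greedy: consumes as much as possible	The default behavior; when matching HTML it often swallows everything up to the last tag
`.*?`	Lazy: consumes as little as possible	Adding a `?` is all it takes, but it is not necessarily faster than greedy, so don't treat it as a cure-all
`.*+`	Possessive: never gives back what it consumed	Not supported in JS (Java and PCRE have it), but very effective against backtracking
`\d` `\w` `\s`	Digit / word character / whitespace	`\w` includes the underscore, which many people forget
`^` `$`	Start / end anchors	Without flags they anchor to the whole string; with the `m` flag they match at the start and end of every line. Multi-line matching has bitten me before
`(?=)` `(?<=)`	Lookahead / lookbehind	Lookbehind is unsupported in older Safari versions, so be careful in production
`\b`	Word boundary	Chinese has no "word boundary" in this sense, so it is essentially useless on Chinese characters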
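
Two of the notes in the table are easy to check for yourself. Here is a minimal sketch; the log text and the sample sentences are made up purely for illustration:

```javascript
// Hypothetical three-line log, used only to show the effect of the m flag.
const log = 'INFO start\nERROR disk full\nINFO done';

// Without m, ^ only matches at the very start of the string.
const withoutM = /^ERROR/.test(log); // false
// With m, ^ also matches right after each line break.
const withM = /^ERROR/m.test(log); // true

// \b needs a \w character on exactly one side. Chinese characters are not \w,
// so there is never a boundary between two of them.
const englishBoundary = /\bcat\b/.test('my cat sleeps'); // true
// The sample sentence means "my cat is cute".
const chineseBoundary = /\b猫\b/.test('我的猫很可爱'); // false

console.log({ withoutM, withM, englishBoundary, chineseBoundary });
```

For Chinese text, plain substring matching or an explicit character class usually does the job that `\b` cannot.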

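Where it actually pays off

ReDoS pitfalls aside, regex really does save work. Take email validation: without regex you hand-write a pile of if statements and still miss edge cases; with regex, a single line expresses the whole intent:

// ❌ Without regex: validating an email address (complicated and incomplete)
function isEmailWithoutRegex(str) {
  if (!str.includes('@')) return false;
  const parts = str.split('@');
  if (parts.length !== 2) return false;
  if (parts[0].length === 0 || parts[1].length === 0) return false;
  if (!parts[1].includes('.')) return false;
  // ... many more conditions to check
  return true;
}

// ✅ With regex: one line
const isEmail = (str) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(str);

The scenarios where I use it most: validating emails, phone numbers, and ID numbers in forms; pulling key fields out of a messy pile of logs; and bulk replacing in an editor, renaming variables, or updating a batch of API calls. Honestly, log analysis is where regex has saved me the most time. I used to write scripts that split every line; now one regex with capture groups does the job. Form validation, on the other hand, is where the most landmines are, and it gets its own discussion later.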
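
To make the log-analysis point concrete, here is a rough sketch. The log format, the `LOG_LINE` pattern, and the `parseLogLine` helper are all assumptions of mine for illustration; real logs will need their own pattern:

```javascript
// Assumed (hypothetical) log format: "YYYY-MM-DD HH:MM:SS [LEVEL] message"
const LOG_LINE = /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>[A-Z]+)\] (?<message>.*)$/;

// Returns the captured fields as a plain object, or null when the line
// does not follow the assumed format.
function parseLogLine(line) {
  const m = LOG_LINE.exec(line);
  return m ? { ...m.groups } : null;
}

const parsed = parseLogLine('2026-01-03 12:00:01 [ERROR] disk full on /dev/sda1');
console.log(parsed.level); // "ERROR"
console.log(parsed.message); // "disk full on /dev/sda1"
console.log(parseLogLine('not a log line')); // null
```

Named groups keep the calling code readable: `parsed.level` says more than `match[2]`.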

Regular expression basics

Basic syntax

1. Character matching

// Exact match
/hello/.test("hello world")  // true
/hello/.test("Hello world")  // false (case-sensitive)

// Case-insensitive
/hello/i.test("Hello world") // true

// Match any character (except line terminators)
/./.test("a")   // true
/./.test("1")   // true
/./.test("\n")  // false

2. Character classes

// [abc] - matches a, b, or c
/[aeiou]/.test("hello")  // true (matches 'e')

// [^abc] - matches any character except a, b, c
/[^aeiou]/.test("hello") // true (matches 'h')

// [a-z] - range
/[a-z]/.test("Hello")    // true (matches 'e')
/[A-Z]/.test("Hello")    // true (matches 'H')
/[0-9]/.test("abc123")   // true (matches '1')

// Combined
/[a-zA-Z0-9]/.test("Hello123")  // true

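3. Predefined character classes

Symbol	Equivalent to	Meaning
`\d`	`[0-9]`	Any digit
`\D`	`[^0-9]`	Any non-digit
`\w`	`[a-zA-Z0-9_]`	Letters, digits, underscore
`\W`	`[^a-zA-Z0-9_]`	Non-word character
`\s`	`[ \t\n\r\f\v]`	Whitespace (in JavaScript, also other Unicode spaces such as U+00A0)
`\S`	`[^ \t\n\r\f\v]`	Non-whitespace

// Examples
/\d/.test("abc123")      // true
/\d+/.test("abc123")     // true (matches "123")
/\w+/.test("hello_123")  // true (matches the whole string)
/\s/.test("hello world") // true (matches the space)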

4. Quantifiers

Quantifier	Meaning	Example
`*`	0 or more times	`/a*/` matches "", "a", "aa"
`+`	1 or more times	`/a+/` matches "a", "aa", but not ""
`?`	0 or 1 time	`/a?/` matches "", "a"
`{n}`	Exactly n times	`/a{3}/` matches "aaa"
`{n,}`	At least n times	`/a{2,}/` matches "aa", "aaa"
`{n,m}`	Between n and m times	`/a{2,4}/` matches "aa", "aaa", "aaaa"

// Examples
/\d{3}/.test("123")           // true
/\d{3}/.test("12")            // false
/\d{3,}/.test("12345")        // true
/[a-z]{2,5}/.test("hello")    // true

5. Position matching

// ^ - start
/^hello/.test("hello world")  // true
/^hello/.test("say hello")    // false

// $ - end
/world$/.test("hello world")  // true
/world$/.test("world hello")  // false

// \b - word boundary
/\bcat\b/.test("cat")         // true
/\bcat\b/.test("category")    // false
/\bcat\b/.test("the cat sat") // true

// Combined: match the whole string
/^\d{3}$/.test("123")         // true
/^\d{3}$/.test("1234")        // false

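Practical cases

Case 1: Validating phone numbers

// Mainland China mobile number: starts with 1, second digit 3-9, 11 digits total
const phoneRegex = /^1[3-9]\d{9}$/;

phoneRegex.test('13812345678'); // true
phoneRegex.test('12812345678'); // false (second digit is 2)
phoneRegex.test('138123456789'); // false (12 digits)

// Stricter: restrict to specific carrier prefixes (allocations change over time, so a list like this goes stale)
const mobileRegex = /^1(3\d|4[5-9]|5[0-35-9]|6[2567]|7[0-8]|8\d|9[0-35-9])\d{8}$/;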

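Case 2: Validating email addresses

// Basic version
const basicEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Stricter version
const strictEmail = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

strictEmail.test('user@example.com'); // true
strictEmail.test('user.name@example.com'); // true
strictEmail.test('user@sub.example.com'); // true
strictEmail.test('user@example'); // false
strictEmail.test('user name@example.com'); // false

A side note: don't obsess over email validation. I once inherited a supposedly "ultra-strict" email regex that rejected addresses containing a + (such as user+tag@gmail.com) along with a number of newer top-level domains, and customer support got daily complaints of "why won't it accept my email". A regex that fully conforms to RFC 5322 is frighteningly long and simply not worth it. The strictEmail above is enough for the vast majority of applications; leave the rest to sending a verification email, which is far more reliable than any regex.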
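
To show what that kind of over-strictness looks like, here is a small comparison. The `overStrictEmail` pattern is something I made up to reproduce the symptoms; it is not the regex from that project:

```javascript
// strictEmail is copied from the case above.
const strictEmail = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Hypothetical "ultra-strict" pattern: no + in the local part, TLD capped at 4 letters.
const overStrictEmail = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/;

const samples = ['user+tag@gmail.com', 'hello@studio.photography', 'user@example.com'];
for (const s of samples) {
  console.log(s, strictEmail.test(s), overStrictEmail.test(s));
}
// user+tag@gmail.com true false
// hello@studio.photography true false
// user@example.com true true
```

Both rejected addresses are perfectly deliverable, which is exactly the kind of false negative that ends up in the support queue.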

Case 3: Extracting URLs

const urlRegex =
  /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)/g;

const text = 'Visit https://toolforge.com and http://example.com for more';
const urls = text.match(urlRegex);
console.log(urls);
// ["https://toolforge.com", "http://example.com"]

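Case 4: Validating password strength

// At least 8 characters, with uppercase, lowercase, a digit, and a special character
function validatePassword(password) {
  const minLength = /.{8,}/; // at least 8 characters
  const hasUpper = /[A-Z]/; // contains an uppercase letter
  const hasLower = /[a-z]/; // contains a lowercase letter
  const hasNumber = /\d/; // contains a digit
  const hasSpecial = /[!@#$%^&*(),.?":{}|<>]/; // contains a special character

  return (
    minLength.test(password) &&
    hasUpper.test(password) &&
    hasLower.test(password) &&
    hasNumber.test(password) &&
    hasSpecial.test(password)
  );
}

validatePassword('Abc123!@'); // true
validatePassword('abc123'); // false (no uppercase letter or special character)
validatePassword('Abcdefgh'); // false (no digit or special character)

// Single-regex version (using lookaheads)
// Note: it only allows the characters in [A-Za-z\d@$!%*?&], a narrower set than validatePassword accepts
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;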
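
The two versions look interchangeable but are not quite. A quick sketch, with validatePassword condensed into an array of checks (the condensing is mine; the patterns are copied from above):

```javascript
// Same five checks as validatePassword above, condensed into an array.
const checks = [/.{8,}/, /[A-Z]/, /[a-z]/, /\d/, /[!@#$%^&*(),.?":{}|<>]/];
const validatePassword = (password) => checks.every((re) => re.test(password));
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// Both agree on this one.
console.log(validatePassword('Abc123!@'), strongPassword.test('Abc123!@')); // true true

// '#' counts as a special character for validatePassword, but strongPassword
// rejects the whole password because '#' is outside its allowed character set.
console.log(validatePassword('Abc123!#'), strongPassword.test('Abc123!#')); // true false
```

If you swap one for the other in a real form, make sure the two character sets are aligned first, or some users will suddenly fail validation.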

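Case 5: Validating ID card numbers

// Mainland China resident ID: 18 characters
// 6-digit region code + 8-digit birth date + 3-digit sequence code + 1 check character
// (this checks the format only; the check character itself is not verified)
const idCardRegex = /^[1-9]\d{5}(19|20)\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])\d{3}[\dXx]$/;

idCardRegex.test('110101199001011234'); // true
idCardRegex.test('11010119900101123X'); // true
idCardRegex.test('110101199013011234'); // false (month 13)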
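
The regex only checks the shape of the number. The last character is a checksum, and verifying it is arithmetic rather than pattern matching, so it belongs in ordinary code. A minimal sketch, assuming the weighted mod-11 scheme from GB 11643-1999; `hasValidChecksum` is a helper name I made up:

```javascript
// Weights for the first 17 digits and the check-character table (GB 11643-1999).
const WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2];
const CHECK_CHARS = '10X98765432';

// Illustrative helper: true when the 18th character matches the weighted sum
// of the first 17 digits, taken modulo 11.
function hasValidChecksum(id) {
  if (!/^\d{17}[\dXx]$/.test(id)) return false;
  let sum = 0;
  for (let i = 0; i < 17; i++) {
    sum += Number(id[i]) * WEIGHTS[i];
  }
  return CHECK_CHARS[sum % 11] === id[17].toUpperCase();
}

console.log(hasValidChecksum('11010519491231002X')); // true
// The sample from the case above has the right shape but the wrong check digit.
console.log(hasValidChecksum('110101199001011234')); // false
```

In practice I would run the regex first for a cheap format check and the checksum second.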

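Case 6: Extracting prices

// Extract prices from text
const priceRegex = /¥?\s?\d+(\.\d{2})?/g;

// Sample text: "Item A: ¥99.99, Item B: ¥199, Item C: 59.90 yuan"
const text = '商品A：¥99.99，商品B：¥199，商品C：59.90元';
const prices = text.match(priceRegex);
console.log(prices);
// ["¥99.99", "¥199", "59.90"]

// More precise: require the currency symbol
const cnPriceRegex = /¥\s?\d+(?:\.\d{2})?/g;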

Case 7: Reformatting dates

// Match the YYYY-MM-DD format
const dateRegex = /^(\d{4})-(\d{2})-(\d{2})$/;

const match = '2026-01-03'.match(dateRegex);
if (match) {
  const [, year, month, day] = match;
  console.log(`year: ${year}, month: ${month}, day: ${day}`);
}

// Convert to MM/DD/YYYY
const convertDate = (dateStr) => {
  return dateStr.replace(/^(\d{4})-(\d{2})-(\d{2})$/, '$2/$3/$1');
};

convertDate('2026-01-03'); // "01/03/2026"
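
Keep in mind that dateRegex only checks the shape: 2026-13-45 passes just as happily as 2026-01-03. When the date has to be real, one option is to let Date do the calendar math. This is a rough sketch, and `isRealDate` is a name I made up:

```javascript
// Illustrative helper: shape check by regex, calendar check by Date.
function isRealDate(dateStr) {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(dateStr);
  if (!m) return false;
  const [year, month, day] = m.slice(1).map(Number);
  // Date.UTC rolls invalid dates over (Feb 30 becomes Mar 2), so if any
  // component changes on the round trip, the input was not a real date.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isRealDate('2026-01-03')); // true
console.log(isRealDate('2024-02-29')); // true (leap year)
console.log(isRealDate('2026-02-30')); // false
console.log(isRealDate('2026-13-01')); // false
```

The regex stays simple and readable, and the part regex is bad at (days per month, leap years) is handled elsewhere.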

Case 8: Removing HTML tags

// Remove all HTML tags
const stripHtml = (html) => html.replace(/<[^>]*>/g, '');

stripHtml('<p>Hello <strong>World</strong></p>'); // "Hello World"

// Remove a specific tag but keep its content
// (\b stops tag "p" from also matching <pre>; tag is assumed to be a plain tag name)
const removeTag = (html, tag) => {
  const regex = new RegExp(`<${tag}\\b[^>]*>|</${tag}>`, 'gi');
  return html.replace(regex, '');
};

removeTag('<p>Hello <strong>World</strong></p>', 'strong');
// "<p>Hello World</p>"

Advanced techniques

1. Capture groups

// Plain capture groups ()
const nameRegex = /(\w+)\s(\w+)/;
const match = 'John Doe'.match(nameRegex);
console.log(match[1]); // "John" (first capture group)
console.log(match[2]); // "Doe"  (second capture group)

// Named capture groups (?<name>)
const nameRegex2 = /(?<first>\w+)\s(?<last>\w+)/;
const match2 = 'John Doe'.match(nameRegex2);
console.log(match2.groups.first); // "John"
console.log(match2.groups.last); // "Doe"

2. Non-capturing groups

// (?:...) - groups without capturing
const urlRegex = /https?:\/\/(?:www\.)?example\.com/;

'https://www.example.com'.match(urlRegex); // matches, but www. is not captured

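3. Lookahead and lookbehind

// Positive lookahead (?=...)
// Match letters that are followed by a digit
// (with \w+ the result would be "abc12", because \w also matches digits)
/[a-z]+(?=\d)/.exec("abc123")[0]  // "abc"

// Negative lookahead (?!...)
// Match a word that is not followed by a digit
/\w+(?!\d)/.exec("abc def")[0]  // "abc"

// Positive lookbehind (?<=...)
// Match digits preceded by $
/(?<=\$)\d+/.exec("$100")[0]  // "100"

// Negative lookbehind (?<!...)
// Match digits not preceded by $ (on "$100" this would still match "00")
/(?<!\$)\d+/.exec("100")[0]  // "100"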
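
A lookahead example that comes up in real work is inserting thousands separators. The sketch below assumes the input is a string of integer digits only (a decimal part would get commas too), and `addThousands` is a name I made up:

```javascript
// \B           a position that is not a word boundary (here: between two digits)
// (?=(\d{3})+  followed by one or more groups of exactly three digits...
// (?!\d))      ...and then no further digit
const addThousands = (digits) => digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');

console.log(addThousands('1234567')); // "1,234,567"
console.log(addThousands('1000')); // "1,000"
console.log(addThousands('999')); // "999"
```

Lookarounds match positions rather than characters, which is why the replacement inserts commas without consuming any digits.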

4. Greedy vs. non-greedy

// Greedy (default): match as much as possible
/<.*>/.exec("<div>Hello</div>")[0]  // "<div>Hello</div>"

// Non-greedy: add ? to match as little as possible
/<.*?>/.exec("<div>Hello</div>")[0]  // "<div>"

// In practice: extracting HTML elements (flat, attribute-free tags only)
const extractTags = (html) => {
  return html.match(/<(\w+)>.*?<\/\1>/g);
};

extractTags("<p>Text1</p><div>Text2</div>")
// ["<p>Text1</p>", "<div>Text2</div>"]

5. Backreferences

// \1 refers to the first capture group
const duplicateRegex = /(\w+)\s+\1/;

duplicateRegex.test('hello hello'); // true
duplicateRegex.test('hello world'); // false

// In practice: finding repeated words (\b keeps "this is" from matching as "is is")
const text = 'the the cat sat on the mat mat';
const duplicates = text.match(/\b(\w+)\s+\1\b/g);
console.log(duplicates); // ["the the", "mat mat"]

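Performance optimization

1. Avoid backtracking

// ❌ Can cause catastrophic backtracking
const badRegex = /(a+)+b/;
badRegex.test('aaaaaaaaaaaaaaaaaaaaac'); // may hang: the work roughly doubles with each extra "a"

// ✅ Optimized version
const goodRegex = /a+b/;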
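
The same idea applies to the pattern from the opening story. The rewrite below is my own sketch of one way to remove the ambiguity, not the fix that actually shipped; it should accept the same strings, but each character can only be consumed one way:

```javascript
// The pattern from the opening story: the inner \w+ and the outer + can split
// one run of word characters in exponentially many ways.
const risky = /^(\w+\s?)+$/;
// A word, then zero or more "one whitespace + word", then an optional trailing whitespace.
const safer = /^\w+(?:\s\w+)*\s?$/;

// Short inputs only: both patterns give the same answer on each.
for (const s of ['hello world', 'hello world ', 'hello  world', 'hello!', '']) {
  console.log(JSON.stringify(s), risky.test(s), safer.test(s));
}

// The kind of input that hung the server: long, with a non-matching last character.
const hostile = 'a'.repeat(200) + '!';
console.log(safer.test(hostile)); // false
// risky.test(hostile) is deliberately not called here.
```

The general rule: if a quantified group contains another quantifier and the two can match the same characters, look for a way to make the split unambiguous.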

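2. Use specific character classes

// ❌ Slower: .* first runs to the end of the string, then backtracks to find the @
const slow = /.*@.*/;

// ✅ Faster: [^@]+ stops at the first @ (note that it also requires at least one character on each side)
const fast = /[^@]+@[^@]+/;

3. Anchor early

// ❌ Unanchored: the engine may have to try every starting position in the string
const slow = /\d{3}/;

// ✅ Anchored: only one starting position to try. This also changes the meaning:
// the whole string must now be exactly three digits, so use it only when that is what you want
const fast = /^\d{3}$/;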

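Common mistakes and pitfalls

Mistake 1: Forgetting to escape special characters

// ❌ Wrong: . matches any character
/example.com/.test("exampleXcom")  // true (not what you want)

// ✅ Right: escape the .
/example\.com/.test("example.com")  // true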

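Mistake 2: Misusing the global flag

const regex = /\d+/g;

// First call
console.log(regex.test('123')); // true
console.log(regex.lastIndex); // 3

// Second call (resumes from the previous position)
console.log(regex.test('123')); // false!
console.log(regex.lastIndex); // 0 (reset after the failed match)

// ✅ Create a fresh regex each time (or simply drop the g flag for test())
const test = (str) => /\d+/.test(str);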
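
The place this usually bites is a shared regex used inside filter() or a validation loop. A short sketch of the symptom and two ways around it (all variable names here are mine):

```javascript
// A shared /g regex keeps state between calls, so filter() silently skips items.
const shared = /\d/g;
const skipped = ['1', '2', '3'].filter((s) => shared.test(s));
console.log(skipped); // ["1", "3"]

// Without the g flag, test() is stateless.
const stateless = /\d/;
const kept = ['1', '2', '3'].filter((s) => stateless.test(s));
console.log(kept); // ["1", "2", "3"]

// When you do need every match, matchAll() works on a copy of the regex
// and leaves the original's lastIndex untouched.
const all = /\d+/g;
const numbers = [...'a1b22c333'.matchAll(all)].map((m) => m[0]);
console.log(numbers); // ["1", "22", "333"]
console.log(all.lastIndex); // 0
```

The same statefulness applies to the y (sticky) flag, so the advice carries over.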

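Mistake 3: Special rules inside character classes

// Inside [], most special characters lose their meaning
/[.]/.test(".")      // true (. needs no escaping)
/[*]/.test("*")      // true (* needs no escaping)

// But there are exceptions
/[\]]/.test("]")     // true (] must be escaped)
/[^abc]/.test("d")   // true (^ at the start has a special meaning)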

Debugging tips

1. Use online tools

Recommended tools:

ToolsForge Regex Tester
Regex101.com
RegExr.com

2. Build step by step

// Start simple
let regex = /\d/; // match a digit
regex = /\d{3}/; // 3 digits
regex = /\d{3}-\d{4}/; // add a separator
regex = /^\d{3}-\d{4}$/; // match the whole string

// Final result: US ZIP code
const zipCode = /^\d{5}(-\d{4})?$/;

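3. Use comments (verbose mode)

// JavaScript has no verbose mode, but you can concatenate strings
const complexRegex = new RegExp(
  '^' + // start
    '(?=.*[a-z])' + // at least one lowercase letter
    '(?=.*[A-Z])' + // at least one uppercase letter
    '(?=.*\\d)' + // at least one digit
    '.{8,}' + // at least 8 characters
    '$' // end
);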
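
One annoyance with string concatenation is the doubled backslashes. A variation I find easier to read uses String.raw, which keeps backslashes literal. This is just a sketch of the same pattern, and `passwordPattern` is my own name for it:

```javascript
// String.raw keeps backslashes as written, so \d does not need doubling.
const passwordPattern = new RegExp(
  [
    String.raw`^`, // start
    String.raw`(?=.*[a-z])`, // at least one lowercase letter
    String.raw`(?=.*[A-Z])`, // at least one uppercase letter
    String.raw`(?=.*\d)`, // at least one digit
    String.raw`.{8,}`, // at least 8 characters
    String.raw`$`, // end
  ].join('')
);

console.log(passwordPattern.source); // ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$
console.log(passwordPattern.test('Abcdefg1')); // true
console.log(passwordPattern.test('abcdefg1')); // false
```

Printing `.source` is a cheap sanity check that the assembled pattern is the one you meant to write.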

Handy code snippets

Validation helpers

const validators = {
  // Email
  email: (str) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(str),

  // Phone number (mainland China)
  phone: (str) => /^1[3-9]\d{9}$/.test(str),

  // URL (only checks for an http:// or https:// prefix)
  url: (str) => /^https?:\/\/.+/.test(str),

  // IPv4 (each octet 0-255; leading zeros such as 01 are accepted)
  ipv4: (str) => /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/.test(str),

  // Hex color
  hexColor: (str) => /^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/.test(str),

  // Username (letters, digits, underscore; 4-16 characters)
  username: (str) => /^[a-zA-Z0-9_]{4,16}$/.test(str),
};

// Usage
console.log(validators.email('test@example.com')); // true
console.log(validators.phone('13812345678')); // true
Text processing utilities

// Escape regex metacharacters so user input is matched literally
const escapeRegExp = (str) => str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const textUtils = {
  // camelCase to snake_case
  camelToSnake: (str) => str.replace(/[A-Z]/g, (letter) => `_${letter.toLowerCase()}`),

  // snake_case to camelCase
  snakeToCamel: (str) => str.replace(/_([a-z])/g, (_, letter) => letter.toUpperCase()),

  // Collapse extra whitespace
  trimSpaces: (str) => str.replace(/\s+/g, ' ').trim(),

  // Extract numbers
  extractNumbers: (str) => str.match(/\d+/g) || [],

  // Highlight a keyword (escaped first; otherwise input like "c++" throws a SyntaxError)
  highlight: (text, keyword) => {
    const regex = new RegExp(`(${escapeRegExp(keyword)})`, 'gi');
    return text.replace(regex, '<mark>$1</mark>');
  },
};

// Usage
console.log(textUtils.camelToSnake('userName')); // "user_name"
console.log(textUtils.snakeToCamel('user_name')); // "userName"
console.log(textUtils.extractNumbers('abc123xyz456')); // ["123", "456"]

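A few honest words to finish

Regex is powerful, but it is absolutely not a cure-all, and I can't stress that enough.

The classic cautionary tale is parsing HTML with regex. There is a legendary Stack Overflow answer that has circulated for years: someone asked how to match HTML tags with regex, and one answerer replied with a long, near-deranged rant to the effect of "you cannot parse HTML with regex; regex is not up to a non-regular language like HTML; HTML will devour your soul". It reads like a medieval exorcism text, but the point is valid. For anything with nested structure, such as HTML, XML, or JSON, use an existing parser and don't try to brute-force it with hand-written regex. I have seen too many people trip over nested tags. Even the <(\w+)>.*?<\/\1> from the earlier case only handles the simplest situations and falls apart as soon as tags nest.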
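
If you want to see it fall apart, it takes three lines. The sample markup here is made up; the pattern is the one from the greedy vs. non-greedy section:

```javascript
const tagPair = /<(\w+)>.*?<\/\1>/g;

// Flat markup: works as in the earlier case.
console.log('<p>Text1</p><div>Text2</div>'.match(tagPair));
// ["<p>Text1</p>", "<div>Text2</div>"]

// Nested markup: the lazy .*? stops at the first </div>, so the outer element is cut short.
console.log('<div><div>inner</div></div>'.match(tagPair));
// ["<div><div>inner</div>"]

// Attributes break it too: <(\w+)> does not match <div class="box">.
console.log('<div class="box">x</div>'.match(tagPair)); // null
```

You can patch each of these cases individually, but every patch makes the pattern harder to read and the next edge case is never far away.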

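My own rule of thumb is crude: if one line of regex expresses the intent clearly, use it. Once a regex gets so long that you can't read it the next day, or needs three layers of lookahead to hold together, that is most likely a sign to switch to a parser or break it into ordinary code. Readability is something the you of three months from now will thank the present you for.

One more thing: writing regex inevitably means testing against real data, but many online tools upload whatever you paste to their servers. If that data contains user phone numbers or order numbers, that is a little unnerving. Our regex tester runs 100% locally in your browser, and not a single byte from the input box leaves your computer, so you can debug sensitive data with peace of mind.

That's it. Regex is a skill that comes with practice; write a few against real requirements and you'll get a feel for it. May you never be woken up in the middle of the night by ReDoS the way I was.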

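About the author

ToolsForge team

We are a small team of developers who build and maintain ToolsForge: 150+ privacy-first tools that run entirely in the browser. These articles come from our day-to-day engineering work, not armchair theory.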
