JavaScript Regex Unicode


The \w and \W character classes are defined in terms of ASCII only: \w matches 
"a" to "z", "A" to "Z", "0" to "9" and "_", and \W matches everything else. To 
match characters from other languages such as Cyrillic or Hebrew, use \uhhhh 
escapes, where "hhhh" is the character's Unicode value as four hexadecimal 
digits. This example extracts the first run of Cyrillic characters from mixed 
text by matching the range \u0400-\u04FF, which is the Cyrillic block.

var text = 'Образец text на русском языке';
var regex = /[\u0400-\u04FF]+/g;

var match = regex.exec(text);
console.log(match[0]);        // logs 'Образец'
console.log(regex.lastIndex); // logs 7, the index just past the match
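
Because the regex carries the g flag, each call to exec() resumes from lastIndex, so calling it in a loop collects every Cyrillic run rather than only the first. The sketch below illustrates that idea. The helper names findCyrillicRuns and findCyrillicRunsByScript are hypothetical and introduced only for this example. The second helper assumes an engine that supports ES2018 Unicode property escapes (the u flag with \p{...}), which may not be available in older browsers.

```javascript
// Hypothetical helper: returns every run of characters from the
// Cyrillic block (U+0400 to U+04FF) found in the input string.
function findCyrillicRuns(text) {
  var regex = /[\u0400-\u04FF]+/g; // new regex per call, so lastIndex starts at 0
  var runs = [];
  var match;
  // With the g flag, exec() continues from regex.lastIndex and
  // returns null once no further match exists.
  while ((match = regex.exec(text)) !== null) {
    runs.push(match[0]);
  }
  return runs;
}

// Hypothetical alternative, assuming ES2018 Unicode property escapes:
// matches by script rather than by a hard-coded code point range.
function findCyrillicRunsByScript(text) {
  return text.match(/\p{Script=Cyrillic}+/gu) || [];
}

var text = 'Образец text на русском языке';
console.log(findCyrillicRuns(text));         // the four Cyrillic words
console.log(findCyrillicRunsByScript(text)); // same words on engines with \p support
```

The explicit range works in any engine, while the script-based pattern avoids hard-coding block boundaries. Which one fits depends on the browsers or runtimes that need to be supported.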