std::regex_traits<CharT>::lookup_classname

If the character sequence [first, last) represents the name of a valid character class in the currently imbued locale (that is, the string between [: and :] in regular expressions), returns the implementation-defined value representing this character class. Otherwise, returns zero.

If the parameter icase is true, the character class ignores character case, e.g. the regex [:lower:] with std::regex_constants::icase generates a call to std::regex_traits<>::lookup_classname() with [first, last) indicating the string "lower" and icase == true. This call returns the same bitmask as the call generated by the regex [:alpha:] with icase == false.

The following narrow and wide character class names are always recognized by std::regex_traits<char> and std::regex_traits<wchar_t> respectively, and the classifications returned (with icase == false) correspond to the matching classifications obtained by the std::ctype facet of the imbued locale, as follows:

Character class name (narrow)	Character class name (wide)	std::ctype classification
"alnum"	L"alnum"	std::ctype_base::alnum
"alpha"	L"alpha"	std::ctype_base::alpha
"blank"	L"blank"	std::ctype_base::blank
"cntrl"	L"cntrl"	std::ctype_base::cntrl
"digit"	L"digit"	std::ctype_base::digit
"graph"	L"graph"	std::ctype_base::graph
"lower"	L"lower"	std::ctype_base::lower
"print"	L"print"	std::ctype_base::print
"punct"	L"punct"	std::ctype_base::punct
"space"	L"space"	std::ctype_base::space
"upper"	L"upper"	std::ctype_base::upper
"xdigit"	L"xdigit"	std::ctype_base::xdigit
"d"	L"d"	std::ctype_base::digit
"s"	L"s"	std::ctype_base::space
"w"	L"w"	std::ctype_base::alnum with '_' optionally added

The classification returned for the string "w" may be exactly the same as "alnum", in which case isctype() adds '_' explicitly.

Additional classifications such as "jdigit" or "jkanji" may be provided by system-supplied locales (in which case they are also accessible through std::wctype).

#include <cwctype>
#include <iostream>
#include <locale>
#include <regex>
#include <string>

// This custom regex traits uses wctype/iswctype to implement lookup_classname/isctype.
struct wctype_traits : std::regex_traits<wchar_t>
{
    using char_class_type = std::wctype_t;

    template<class It>
    char_class_type lookup_classname(It first, It last, bool = false) const
    {
        // std::wctype takes a narrow string; class names are assumed to be ASCII,
        // so converting each wchar_t to char loses nothing here.
        return std::wctype(std::string(first, last).c_str());
    }

    bool isctype(wchar_t c, char_class_type f) const
    {
        return std::iswctype(c, f);
    }
};

int main()
{
    // Requires the ja_JP.utf8 locale to be installed; std::locale throws otherwise.
    std::locale::global(std::locale("ja_JP.utf8"));
    std::wcout.sync_with_stdio(false);
    std::wcout.imbue(std::locale());

    std::wsmatch m;
    std::wstring in = L"風の谷のナウシカ";
    // matches all characters (they are classified as alnum)
    std::regex_search(in, m, std::wregex(L"([[:alnum:]]+)"));
    std::wcout << "alnums: " << m[1] << '\n'; // prints "風の谷のナウシカ"
    // matches only the katakana
    std::regex_search(in, m,
                      std::basic_regex<wchar_t, wctype_traits>(L"([[:jkata:]]+)"));
    std::wcout << "katakana: " << m[1] << '\n'; // prints "ナウシカ"
}

Output:

alnums: 風の谷のナウシカ
katakana: ナウシカ

See also

isctype	indicates membership in a character class (public member function)
wctype	looks up a character classification category in the current C locale (function)
