
The documentation says that the Intl.Segmenter constructor takes a language (locale) as its first argument, something like this:

const result = new Intl.Segmenter("en", { granularity: "word" })
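
For context, here is a minimal sketch of how such a segmenter is typically used. The variable names (`segmenter`, `sample`, `pieces`, `words`) and the sample string are illustrative only and are not taken from my test.

```javascript
// Split a short string into words with an explicit locale ("en" here).
const segmenter = new Intl.Segmenter("en", { granularity: "word" });
const sample = "Hello world";

// segment() returns an iterable of { segment, index, input, isWordLike } objects.
const pieces = Array.from(segmenter.segment(sample));

// Keep only word-like segments; the space between the words is not word-like.
const words = pieces
  .filter((piece) => piece.isWordLike)
  .map((piece) => piece.segment);

console.log(words); // ["Hello", "world"]

// The locale the engine actually resolved for the request.
console.log(segmenter.resolvedOptions().locale);
```

`resolvedOptions().locale` is worth logging when testing, because it shows which locale the engine selected rather than the one that was requested.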

My tests show that Intl.Segmenter produces exactly the same segmentation, regardless of the language it is told to expect.

My understanding is that different languages require different word segmentation rules, and that Intl.Segmenter needs to be told which language to expect. However, my tests across European, Arabic, Indic and Asian languages, in the latest version of all major browsers, show that the language parameter seems to be ignored.

Perhaps this is pilot error. I would be delighted to discover that I am mistaken.

Perhaps I have simply not tested any language for which the default segmentation rules give poor results. In that case, I would be delighted to see examples of text in languages which make my test fail.
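
One way to hunt for such examples without the DOM is a small helper that groups locales by the boundaries they produce for a given text. This is only a sketch: `groupLocalesBySegmentation` is a hypothetical helper, and the candidate strings are assumptions, not confirmed counterexamples. They are based on the understanding that CLDR tailors word breaking for Finnish and Swedish (around the colon) and sentence breaking for Greek (around the Greek question mark), and whether a given engine reflects that may depend on its ICU version.

```javascript
// Hypothetical helper (not part of the original test): segment `text` once per
// locale and group the locales by the segment boundaries they produce.
function groupLocalesBySegmentation(text, locales, granularity = "word") {
  const groups = new Map();
  for (const locale of locales) {
    const seg = new Intl.Segmenter(locale, { granularity });
    // Only the segment strings matter when comparing boundaries.
    const key = JSON.stringify(Array.from(seg.segment(text), (s) => s.segment));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(locale);
  }
  // One group: every locale agreed. Several groups: the locale made a difference.
  return Array.from(groups.values());
}

// Candidate inputs are assumptions to verify, not known failures.
const candidates = [
  { text: "EU:n jäsenmaat", locales: ["en", "fi", "sv"], granularity: "word" },
  { text: "Τι κάνεις; Καλά.", locales: ["en", "el"], granularity: "sentence" },
];

for (const { text, locales, granularity } of candidates) {
  const groups = groupLocalesBySegmentation(text, locales, granularity);
  console.log(granularity, JSON.stringify(text), JSON.stringify(groups));
}
```

If any candidate prints more than one group, the locale argument changed the result for that text in that engine.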

Here's a CodePen. You'll find a snippet below.

const languages = Array.from(document.querySelectorAll("p[lang]"))
const texts = languages.map(language => language.textContent)
const langs = languages.map(language => language.getAttribute("lang"))
const granularities = ["grapheme", "word", "sentence"]
console.log("langs:", langs, ", granularities:", granularities, ", texts:", texts)

// Segment one text with every segmenter locale and every granularity
const getSegmentsAcrossLanguages = (text) => {
  const segments = langs.reduce((result, lang) => {
    // One array of segments per granularity, in the order of `granularities`
    const types = granularities.map(granularity => {
      const options = { granularity }
      const segmenter = new Intl.Segmenter(lang, options)
      return Array.from(segmenter.segment(text))
    })
    result[lang] = types
    return result
  }, {})
  return segments
}

// Build the full table: text language (vo) -> segmenter locale -> segmentations
const allSegments = languages.reduce((result, language) => {
  const text = language.textContent
  const vo = language.getAttribute("lang")
  const segments = getSegmentsAcrossLanguages(text)
  result[vo] = segments
  return result
}, {})

// Test whether, for each text, every segmenter locale produces identical segmentations
const identical = Object.entries(allSegments).every(([vo, segmentMap]) => {
  const data = Object.entries(segmentMap)
  // Pair each locale's segmentations with the previous locale's, wrapping around to form a cycle
  const pairs = data.map(([lang, segments], index) => {
    const compare = index ? data[index - 1][1] : data[data.length - 1][1]
    return [segments, compare]
  })
  // Same text, different segmenter locales; all three granularities are compared at once
  return pairs.every(([segments1, segments2]) => (
    JSON.stringify(segments1) === JSON.stringify(segments2)
  ))
})
document.getElementById("result").innerText = `All segmentations are identical, regardless of segmentation language: ${identical}`

span {
  font-size: 1.5em;
  color: red;
}

<span id="result"></span>
<p lang="am">አማርኛ ፡ የኢትዮጵያ ፡ መደበኛ ፡ ቋንቋ ፡ ነው ። ሴማዊ ፡ ቋንቋዎች ፡ ውስጥ ፡ የሚመደብ ፡ ሲሆን ፡ ካረቢኛ ፡ ቀጥሎ ፡ ሁለተኛ ፡ ብዙ ፡ ተናጋሪዎች ፡</p>
 <p lang="ar">اللغة العربية غنية بالمعاني والتراكيب النحوية التي تجعلها متميزة عن غيرها من اللغات.</p>
<p lang="bn">বাংলাদেশের সরকারী ভাষা বাংলা এবং এটি বিশ্বব্যাপী প্রায় ২৫ কোটি মানুষের মাতৃভাষা।</p>
<p lang="en">English is a West Germanic language that emerged in early medieval England and has since become a global lingua franca.</p>
<p lang="fr">Les premières occurrences du mot « France » en langue française se rencontrent au XIe siècle.</p>
<p lang="hi">भारत में बहुत सारी भाषाएँ बोली जाती हैं, और प्रत्येक भाषा का अपना अलग व्याकरण होता है।</p>
<p lang="ja">日本において、国号を直接かつ明確に規定した法令は存在しない。</p>
<p lang="ko">한국어(韓國語) 또는 조선어(朝鮮語)는 조선민주주의인민공화국의 공용어이다. 둘은 표기나 문법, 동사 어미, 표현에서 약간의 차이가 있다.</p>
<p lang="iu">ᐃᓅᔪᓕᒫᑦ ᐊᓂᖅᑎᕆᔪᓕᒫᑦ ᐃᓅᓚᐅᕐᒪᑕ ᐃᓱᒪᕐᓱᕐᓚᑎᒃ ᐊᒻᒪᓗ ᐊᔾᔨᐅᖃᑎᒦᒃᓗᑎᒃ ᓂᕐᓱᐊᖑᓂᒃᑯᑦ ᐊᒻᒪᓗ ᐱᔪᓐᓀᑎᑎᒍᑦ.</p>
<p lang="ru">Россия — многонациональное государство с широким этнокультурным многообразием.</p>
<p lang="ta">தமிழ் மொழி பல்வேறு இலக்கண விதிகளை கொண்டது, அது மற்ற மொழிகளிலிருந்து வேறுபட்டது.</p>
<p lang="th">ประเทศไทยมีประชากรเกือบ 66 ล้านคน พื้นที่ประมาณ 513,115 ตารางกิโลเมตร</p>

asked Dec 5, 2025 at 22:10
