16

I am using a function from GitHub in my project.

The function sends a welcome email when a new user signs up and a goodbye email when a user account is deleted. It is deployed as a Firebase Cloud Function.

I'm trying to extend the code so that it determines, from the user's name, which language the message should be sent in.

Example:

If the user's name is typed in Hebrew, the function sends the user a message in Hebrew.

If the user's name is typed in Russian, the function sends the user a message in Russian.

If the user's name is typed in English, the function sends the user a message in English.

Note:

No browser is involved, because users register from an Android application. After a user is authenticated with Firebase, they get a message from the Firebase Cloud Function.

In Node.js, the code below does not work:

if (/^[a-zA-Z]+$/.test(text)) { // the text is in English
  ...
} else { // the text is not in English
  ...
}

I would be glad of any help!

Maybe there is another way to localize the message?

Thanks!

Amol M Kulkarni
asked Apr 29, 2019 at 4:54
5
  • 2
    Have you considered letting the user choose the preferred language? Commented Apr 29, 2019 at 4:59
  • No, because the function is invoked by triggers: sign up and sign out. Commented Apr 29, 2019 at 5:04
  • Does a browser (in which a user is interacting) not know its own locale that can then be propagated in some HTTP header property? ... for example stackoverflow.com/questions/673905/… Commented Apr 29, 2019 at 5:08
  • 2
    It's a poor assumption to assume a certain language based on a name. For example, my last name originates from Italy, doesn't mean I speak it though. Commented Apr 29, 2019 at 5:08
  • 1
    Russian uses Cyrillic, but so do other languages. If you want to guess the language, use the Accept-Language header. It is the reason why browsers send it with (almost) every request. Commented Apr 29, 2019 at 5:14

4 Answers

15

You can use the languagedetect Node.js library to detect the language of a string.

However, since your requirement is to send the message in the user's language, it is better to let users select their preferred language or, in a browser, to use JavaScript to read the browser's language setting from navigator.language.
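Since there is no browser in this setup, one way to apply the preference-based approach is to have the Android app store the device locale with the user's profile and let the Cloud Function pick a template from it. The following is only a sketch under that assumption: the template texts, the names WELCOME_TEMPLATES, resolveLanguage and buildWelcomeMessage, and the idea that a locale string is available to the function are all hypothetical and not part of the original code.

```javascript
// Hypothetical message templates keyed by primary language code.
const WELCOME_TEMPLATES = {
  en: (name) => `Welcome, ${name}!`,
  ru: (name) => `Добро пожаловать, ${name}!`,
  he: (name) => `ברוך הבא, ${name}!`,
};

// Language used when the locale is missing or unsupported.
const DEFAULT_LANGUAGE = 'en';

// Reduce a locale tag such as "ru-RU" or "he_IL" to a supported language code.
function resolveLanguage(locale) {
  if (typeof locale !== 'string' || locale.length === 0) {
    return DEFAULT_LANGUAGE;
  }
  let primary = locale.split(/[-_]/)[0].toLowerCase();
  // Java's Locale has historically reported Hebrew as the legacy code "iw".
  if (primary === 'iw') {
    primary = 'he';
  }
  const supported = Object.prototype.hasOwnProperty.call(WELCOME_TEMPLATES, primary);
  return supported ? primary : DEFAULT_LANGUAGE;
}

// Build the welcome text for a user, given their name and stored locale.
function buildWelcomeMessage(name, locale) {
  return WELCOME_TEMPLATES[resolveLanguage(locale)](name);
}
```

Inside the sign-up trigger, buildWelcomeMessage(displayName, storedLocale) would then supply the email body, with English as the fallback when no locale was stored.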

Amol M Kulkarni
answered Apr 29, 2019 at 5:03

5 Comments

This works, but I don't understand how to use it. I ran an experiment with my name, Yury Matatov, and English was not in the first position. Why?
Detecting the language of a sentence is easy, but names are proper nouns, so they have a high chance of being mistaken for a different language. Like I said, it is better to let the user provide a language preference.
If you need more accuracy, you can try the paid Google Translate API: npmjs.com/package/@google-cloud/translate. Sample code can be found at cloud.google.com/translate/docs/detecting-language
Maybe it can distinguish between Latin and Cyrillic letters?
That detects the browser language, not the text language. Consider an edge case: I am typing in Arabic, but my browser language is English.
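The Latin-versus-Cyrillic idea from the comments above can be sketched with Unicode property escapes, which Node.js supports in regular expressions with the u flag. This is a rough illustration, not a language detector: it only identifies the dominant script, so Cyrillic is treated as Russian even though Ukrainian, Bulgarian and others use it, and any Latin text is treated as English. The function name guessLanguageByScript and the language codes are assumptions made for this example. Note also that the regex in the question, /^[a-zA-Z]+$/, rejects any string containing a space, such as "Yury Matatov", which may be one reason it appeared not to work.

```javascript
// Each supported language is paired with one Unicode script.
// This identifies the script only, not the actual language.
const SCRIPT_PATTERNS = [
  ['he', /\p{Script=Hebrew}/gu],
  ['ru', /\p{Script=Cyrillic}/gu],
  ['en', /\p{Script=Latin}/gu],
];

// Return the language whose script has the most characters in the text,
// or the fallback when no supported script appears at all.
function guessLanguageByScript(text, fallback = 'en') {
  let best = fallback;
  let bestCount = 0;
  for (const [language, pattern] of SCRIPT_PATTERNS) {
    const matches = String(text).match(pattern);
    const count = matches ? matches.length : 0;
    if (count > bestCount) {
      best = language;
      bestCount = count;
    }
  }
  return best;
}
```

Counting characters per script, instead of requiring the whole string to match, lets spaces, digits and punctuation pass through without affecting the result.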
7

Facebook's fastText is the best solution for this problem that does not require a large, slow machine learning model.

@smodin/fast-text-language-detection lets you use it in a Node.js application: https://www.npmjs.com/package/@smodin/fast-text-language-detection (disclaimer: I created it out of necessity)

Context:

I run a large multilingual site, and I found that franc and LanguageDetect (currently the most popular Node.js libraries) weren't accurate enough, despite using them for a month.

Based on further research and this blog post ( https://towardsdatascience.com/benchmarking-language-detection-for-nlp-8250ea8b67c ), I determined that Facebook's fastText is the best solution out there because:

  1. It has better accuracy than typical approaches that use short Unicode blocks to predict languages, which often fail on short texts with many proper nouns

  2. It doesn't have the odd caveats that are common in Unicode-based predictions

The downside is that it's 150 MB, so it's not a reasonable solution on the front end. It works best on longer text, but it performs significantly better on shorter texts than franc and LanguageDetect.

EDIT: Accuracy testing. I've just added the results of testing 550k sentences in 99 languages, each 30-250 characters long. The accuracy is around 99% for most major languages, even when the length is reduced to 10-40 characters. See more here. I also added franc and languagedetect accuracies for reference here.
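To check claims like these against your own data, a small harness can compute per-language accuracy for any detector. This is a generic sketch, not the benchmark code used for the results above: measureAccuracy, the shape of the samples, and the assumption that the detector is a synchronous function returning a language code are all made up for illustration. An asynchronous detector would need an awaited variant.

```javascript
// `detect` maps a text to a language code.
// `samples` is an array of { text, language } objects with known labels.
function measureAccuracy(detect, samples) {
  const stats = {};
  for (const { text, language } of samples) {
    if (!stats[language]) {
      stats[language] = { correct: 0, total: 0 };
    }
    stats[language].total += 1;
    if (detect(text) === language) {
      stats[language].correct += 1;
    }
  }
  // Convert the counts into a fraction of correct guesses per language.
  const accuracy = {};
  for (const [language, { correct, total }] of Object.entries(stats)) {
    accuracy[language] = correct / total;
  }
  return accuracy;
}
```

Running it on short samples such as user names, not only full sentences, shows how a detector behaves on the kind of input the question is about.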

answered Sep 9, 2021 at 19:06


1
answered Apr 29, 2019 at 5:47


1

You can use the franc and langs npm packages to detect the language. First, install them by running this command in bash:

$ npm i franc langs

Put this code in a JS file, for example index.js:

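// Assumes a CommonJS release of franc (v5 or earlier); later releases are ESM-only.
const franc = require('franc');
const langs = require('langs');

// Text to analyse, passed as the first command-line argument
const input = process.argv[2] || '';
// franc returns an ISO 639-3 code, or 'und' when it cannot decide
const langCode = franc(input);

if (langCode === 'und') {
  console.log('Sorry, could not determine the language');
} else {
  // Look up the language entry by its ISO 639-3 code
  const language = langs.where('3', langCode);
  console.log(`Our guess: ${language ? language.name : langCode}`);
}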

How to run the file in bash?

$ node {filename}.js '{sentence}'

https://github.com/hassamq/Language-guesser-node-js
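To connect a detector like this to the original problem, its output still has to be mapped to the languages the emails actually exist in. Here is a minimal sketch, assuming the detector returns ISO 639-3 codes (as franc does, with 'und' for undetermined input) and that only English, Russian and Hebrew templates are available. The names ISO3_TO_SUPPORTED and pickMessageLanguage are hypothetical, and the detector is passed in as an argument so the logic does not depend on any particular library.

```javascript
// ISO 639-3 codes mapped to the (assumed) supported message languages.
const ISO3_TO_SUPPORTED = new Map([
  ['eng', 'en'],
  ['rus', 'ru'],
  ['heb', 'he'],
]);

// `detect` is any function returning an ISO 639-3 code for a text.
// Unknown or undetermined codes (such as 'und') yield the fallback.
function pickMessageLanguage(text, detect, fallback = 'en') {
  const code = detect(text);
  return ISO3_TO_SUPPORTED.get(code) || fallback;
}
```

With franc loaded as in the answer above, the call would look like pickMessageLanguage(displayName, franc). Short inputs such as names may come back as 'und', in which case the fallback language is used.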

answered Dec 27, 2020 at 16:27

