25,250 questions
Score of 0
0 answers
20 views
How to set consistent font fallback for mixed English/Hindi text in PptxGenJS?
I'm generating a PowerPoint programmatically with PptxGenJS (Node.js) that mixes English and Hindi (Devanagari) text on the same slide — for example, a bilingual title like "Email Migration Guide ...
- reputation score 1
Advice
0 votes
4 replies
120 views
How can I check whether a string contains only digits in Go?
In Python, I can use the str.isdigit() method to check whether a string contains only numeric characters.
For example:
"12345".isdigit() # True
"123a".isdigit() # False
...
- reputation score 1
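A small Python sketch of the behavior the digits question above refers to, since it affects what "only digits" means when porting the check to another language. `str.isdigit()` is Unicode-aware, so it also accepts characters such as superscript two and Devanagari digits. The helper `is_ascii_digits` is a hypothetical name introduced here for illustration, not something from the question.

```python
def is_ascii_digits(s: str) -> bool:
    """Return True only for non-empty strings made of the ASCII digits 0-9.

    Hypothetical helper: str.isascii() rejects non-ASCII digits that
    str.isdigit() alone would accept.
    """
    return s.isascii() and s.isdigit()


# The two cases from the question excerpt.
print("12345".isdigit())  # True
print("123a".isdigit())   # False

# str.isdigit() is Unicode-aware, which may or may not be what you want.
print("\u00b2".isdigit())              # True (SUPERSCRIPT TWO)
print("\u0967\u0968\u0969".isdigit())  # True (Devanagari digits 1, 2, 3)

# The stricter ASCII-only check rejects both, and the empty string too.
print(is_ascii_digits("\u00b2"))  # False
print(is_ascii_digits(""))        # False
```

Whether the Unicode-aware or the ASCII-only definition is the right one depends on the input being validated, so it is worth deciding that before choosing an equivalent in Go.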
Score of 0
1 answer
123 views
_kbhit() unusual behavior with Unicode codepage on Windows console
I'm setting a Windows console with code page UTF-8 (65001), but the results with _kbhit() are a bit erratic (described below), when inputting keys in the "Latin Supplement" range, and even ...
- reputation score 473
Best practices
0 votes
5 replies
143 views
Best alternative for ❌ (U+274C; Cross Mark)
I want a simple error icon. Unicode has several cross characters, but the one that makes sense in context is annoyingly colored and looks ...
Best practices
0 votes
2 replies
64 views
Is Apple's UCCompareCollationKeys() a strong or a weak ordering?
I'm wondering whether the result of comparing collation sort keys (= already pre-processed strings for faster collation-compatible sorting/searching) is a strong or weak ordering. The implementation ...
- reputation score 25631
Score of 1
2 answers
203 views
Can font-size be set by unicode range?
I'd like to change the size of the symbols but they cannot be wrapped in additional HTML tags. Is there a way to target a unicode range when setting font-size? Such as,
span.name[...unicode range...] ...
Score of 1
1 answer
129 views
How to search and replace Unicode characters with a Word macro?
I am trying to replace several Unicode characters in text strings in Microsoft Word.
The issue is when I try to use these text strings in other applications, the Unicode characters convert to a ...
Advice
0 votes
5 replies
88 views
How to get the name of characters made up of multiple codepoints with ICU?
I managed to find u_charName() for getting the name of a single character, but what about characters like flag emojis, which are made up of multiple codepoints? Do characters like that even have names?...
Best practices
0 votes
3 replies
100 views
OCR output contains "garbage" characters after special symbols (mojibake / control chars) — how to reliably clean before returning from LLM?
I have an on-prem OCR pipeline that returns extracted text inside a JSON blob. I parse the LLM response and call a local normalizer before returning the text to callers. Example call site:
result = ...
Score of -4
1 answer
219 views
Getting weird results from Java string codepoints on a Windows machine [closed]
package edu.practice.zapper;
import java.io.IOException;
import java.lang.ProcessBuilder.Redirect;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Base64;
import java....
- reputation score 13
Score of 0
1 answer
85 views
Using XSLT 3.0 in Saxon-JS 2, how can one configure the processor so that it accepts codepoints-to-string(8)?
Within Saxon-J, I can set the processor configuration to allow XML 1.1 characters, for example by:
processor.getUnderlyingConfiguration().setXMLVersion(XML11);
I'm looking for the equivalent in Saxon-...
- reputation score 612
Score of 0
1 answer
165 views
UnicodeEncodeError: 'charmap' codec can't encode characters when writing to HTML
I have a pandas DataFrame that I wish to paste into an HTML document. The DataFrame contains Dingbat characters used as symbols to highlight values as good (checkmark), nearly bad (triangle), or bad (...
- reputation score 541
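A minimal sketch of one common cause of the 'charmap' error in the question above, assuming the file is opened with a legacy default encoding such as cp1252 (typical on Windows). The HTML string, the specific symbols, and the helper names `can_encode` and `write_html` are illustrative assumptions; the excerpt does not show the asker's actual code or DataFrame.

```python
import os
import tempfile

# Illustrative HTML with a check mark (U+2714) and a triangle (U+25B2);
# the real symbols in the asker's DataFrame are not shown in the excerpt.
html = "<table><tr><td>\u2714 good</td><td>\u25b2 nearly bad</td></tr></table>"


def can_encode(text: str, encoding: str) -> bool:
    """Report whether every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_html(path: str, text: str) -> None:
    """Write text with an explicit UTF-8 encoding instead of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


print(can_encode(html, "cp1252"))  # False
print(can_encode(html, "utf-8"))   # True

# Round-trip through a temporary file to confirm nothing is lost.
path = os.path.join(tempfile.mkdtemp(), "report.html")
write_html(path, html)
with open(path, encoding="utf-8") as f:
    print(f.read() == html)  # True
```

If the output is served as a web page, the document would also need to declare UTF-8 (for example with a `<meta charset="utf-8">` tag) so that browsers decode it the same way.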
Score of 0
0 answers
70 views
Is there an equivalent of ICU4J's PersonNameFormatter in ICU4C?
I'm working on a Qt6/C++20 application that needs to handle localization. So far, we've gotten away with using Qt's built-in localization; however, we want to add person names to our UI, with proper ...
Score of 4
0 answers
237 views
How can I apply tail/tails to a string of text in a Unicode-aware manner?
This "warning sign" character, ⚠️, corresponds to the sequence of codepoints U+26A0 U+FE0F (if I understand correctly, it is ⚠ followed by a variation selector character), so I can render it ...
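To make the code point sequence in the question above concrete, here is a short Python sketch (not the asker's language or code) that lists the code points of the warning sign and shows what a naive, code-point-based "tail" does to it. The helper names `code_points` and `naive_tail` are hypothetical.

```python
import unicodedata

# U+26A0 followed by U+FE0F, as described in the question.
warning = "\u26a0\ufe0f"


def code_points(s: str) -> list[str]:
    """Format each code point of s as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in s]


def naive_tail(s: str) -> str:
    """Drop the first code point, ignoring grapheme cluster boundaries."""
    return s[1:]


print(code_points(warning))                      # ['U+26A0', 'U+FE0F']
print([unicodedata.name(ch) for ch in warning])  # ['WARNING SIGN', 'VARIATION SELECTOR-16']

# The naive tail leaves the variation selector stranded on its own.
print(code_points(naive_tail(warning)))          # ['U+FE0F']
```

A Unicode-aware version would split on extended grapheme cluster boundaries instead of code points, which the Python standard library does not provide directly.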
Best practices
0 votes
2 replies
46 views
Unicode encoding of units of measure
I'm working on a proprietary font that uses custom glyphs for certain units of measure. For example, I want to display pH for acidity as a single glyph, or nm for nanometers, etc. This font is ...
- reputation score 130