In ReportLab, I need to display an Arabic paragraph. I am using arabic-reshaper and python-bidi (bidi.algorithm). The problem seems to be in the step that reverses the text: the lines of the paragraph come out in the wrong order.

My code is:

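import arabic_reshaper
from bidi import algorithm
from reportlab.lib.styles import ParagraphStyle
from reportlab.platypus import Paragraph

def format_arabic_paragraph(arabic_text: str, paragraph_style: ParagraphStyle):
    reshaped_text = arabic_reshaper.reshape(arabic_text)
    # Convert the reshaped text from logical to visual order.
    bidirectional_text = algorithm.get_display(reshaped_text)
    return Paragraph(bidirectional_text, paragraph_style)

arabic_text = 'وحدة1 وحدة2'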

Here's the output:
وحدة2
وحدة1

Here's what I expected:
وحدة1
وحدة2

TechyBenji
asked Jul 12, 2023 at 13:33
  • I have tried another library called pyfribidi. It seems that the problem is with ReportLab starting the Arabic paragraph from the bottom. Commented Jul 12, 2023 at 20:06
  • bidi.algorithm converts the string from logical ordering to visual ordering, and in the process reverses the order of the words within the string. The first word in arabic_text is وحدة1, but the first word after reordering the string with bidi.algorithm is وحدة2. You would need to reverse the order of the words before returning. Commented Jul 15, 2023 at 14:10
  • It does not reverse the word order. It reverses the lines by rendering them from the bottom of the paragraph. Hence, I think the problem is with ReportLab and how it renders RTL text. For now, I'm working on a text wrapper to fix this problem. However, I would like to know whether there is a stable solution. Commented Jul 16, 2023 at 5:37
  • Converting from logical to visual order reverses the string, and that in itself changes the word order. In logical order, the first character in the data should be the rightmost character on display, so the first word in the data is the word rendered rightmost on the first line. Converting the string to visual order changes it so that the leftmost word is the first word in the string (but the word is stored backwards). It's easier to observe when looking at the codepoints. Commented Jul 16, 2023 at 5:49
  • arabic_text is U+0648 (و) U+062D (ح) U+062F (د) U+0629 (ة) U+0031 (1) U+0020 ( ) U+0648 (و) U+062D (ح) U+062F (د) U+0629 (ة) U+0032 (2). So the first substring is U+0648 (و) U+062D (ح) U+062F (د) U+0629 (ة) U+0031 (1), while the second is U+0648 (و) U+062D (ح) U+062F (د) U+0629 (ة) U+0032 (2). For bidirectional_text you have U+0032 (2) U+FE93 (ة) U+FEAA (د) U+FEA3 (ح) U+FEED (و) U+0020 ( ) U+0031 (1) U+FE93 (ة) U+FEAA (د) U+FEA3 (ح) U+FEED (و). So the first substring is U+0032 (2) U+FE93 (ة) U+FEAA (د) U+FEA3 (ح) U+FEED (و) and the second is U+0031 (1) U+FE93 (ة) U+FEAA (د) U+FEA3 (ح) U+FEED (و). Commented Jul 16, 2023 at 8:24
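
The codepoint comparison in the comments above can be reproduced with a few lines of standard-library Python. This is only a rough sketch: the helper name codepoints is made up for illustration, and plain string reversal is used as a stand-in for the bidi step. That shortcut is an assumption that happens to hold for this sample because every number is a single digit; it is not what bidi.algorithm does in general, and it does not produce the reshaped presentation forms (U+FE93 and so on).

```python
def codepoints(text: str) -> list[str]:
    """Return each character of text as a 'U+XXXX' label, in string order."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The sample string from the question, written with escapes so that the
# storage (logical) order is unambiguous: waw, hah, dal, teh marbuta, digit.
word = "\u0648\u062D\u062F\u0629"
logical = word + "1" + " " + word + "2"

# ASSUMPTION: plain reversal stands in for the logical-to-visual step.
# It matches this sample only because each number is a single digit.
visual_like = logical[::-1]

for label, text in (("logical", logical), ("visual-like", visual_like)):
    for index, part in enumerate(text.split(" "), start=1):
        print(f"{label} word {index}: {' '.join(codepoints(part))}")
```

In the logical string the first space-separated word ends in U+0031, while in the reversed string the first word starts with U+0032, which is the word-order flip described in the comments.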

1 Answer

ReportLab has partial bidirectional support through FriBidi, but it is disabled by default. To enable it, set the rtlSupport option found in reportlab/rl_settings; see the user guide section on site configuration. In my installation, I created the file ~/.reportlab_settings and added the line rtlSupport=1.
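
If you would rather script that change than edit the file by hand, a minimal sketch could look like the following. The helper name enable_rtl_support is hypothetical and not part of ReportLab; it only appends the rtlSupport=1 line mentioned above to whatever path you give it, and leaves the file alone if the line is already there. As far as I can tell ReportLab reads its settings at import time, so run this before the process that builds the PDF.

```python
from pathlib import Path

SETTING_LINE = "rtlSupport=1"


def enable_rtl_support(settings_path: Path) -> bool:
    """Append rtlSupport=1 to the settings file unless it is already present.

    Hypothetical helper, not a ReportLab API. Returns True if the file
    was written, False if the setting was already there.
    """
    existing = ""
    if settings_path.exists():
        existing = settings_path.read_text(encoding="utf-8")

    # Ignore spaces so that "rtlSupport = 1" also counts as present.
    for line in existing.splitlines():
        if line.replace(" ", "") == SETTING_LINE:
            return False

    # Make sure the new line starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    settings_path.write_text(existing + separator + SETTING_LINE + "\n", encoding="utf-8")
    return True


# Example call (commented out so that nothing is written by accident):
# enable_rtl_support(Path.home() / ".reportlab_settings")
```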

In ParagraphStyle() set wordWrap="RTL".

An example:

from reportlab.lib.enums import TA_RIGHT
from reportlab.lib.pagesizes import LETTER
from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.platypus import Paragraph
from reportlab.platypus.doctemplate import SimpleDocTemplate

# Register a font that contains Arabic glyphs.
pdfmetrics.registerFont(TTFont("Arial Unicode", "Arial Unicode.ttf"))

# Plain logical-order text: no reshaping or get_display() call here.
arabic_text = 'وحدة1 وحدة2'

# Wide margins leave a narrow column, so the paragraph has to wrap.
doc = SimpleDocTemplate(
    "repro_ar.pdf",
    pagesize=LETTER,
    rightMargin=280,
    leftMargin=280,
    topMargin=72,
    bottomMargin=72,
)

styles = getSampleStyleSheet()
normal_arabic = ParagraphStyle(
    parent=styles["Normal"],
    name="NormalArabic",
    wordWrap="RTL",  # right-to-left line breaking
    alignment=TA_RIGHT,
    fontName="Arial Unicode",
    fontSize=14,
    leading=16,
)

flowables = [Paragraph(arabic_text, normal_arabic)]
doc.build(flowables)

This gives me:

(screenshot of the rendered PDF; image not preserved)

answered Jul 17, 2023 at 12:09

6 Comments

Works as you said. Thank you.
Sorry for removing the accepted mark. I have a bug with this: wrapping a long string produces an indentation.
f"اللقب والاسم: اللقب والاسم: اللقب والاسم: " f"<br />" f" اللقب والاسم: " f"<br />" You could try this and watch the difference.
Resolved. I have to use arabic_reshaper too: supplier_string_format = format_supplier_string(supplier); reshaped_text = arabic_reshaper.reshape(supplier_string_format); return Paragraph(reshaped_text, ___leading_text_style)
@Mounir, in theory FriBidi also does the same reshaping, but I suspect that the line length and layout calculations are done before the text is passed to FriBidi, which throws the lines off. I tried different margins and different lines of numbers, and the results were not consistent. As you indicate, applying arabic_reshaper before creating the Paragraph object solved the problem. As a rule, I tend to use from arabic_reshaper import ArabicReshaper so that I can pass configuration information, including the language, to the reshaper.
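
A rough sketch of the approach from the last two comments: reshape the text with a configured ArabicReshaper first, then hand the result to Paragraph with an RTL style. The function names make_reshaper and prepare_text are made up for illustration, and the configuration shown (language set to "Arabic") is just one assumed choice, not something the commenters specified. Note that there is no get_display() call, since ReportLab does the reordering itself once rtlSupport is enabled.

```python
def make_reshaper():
    """Build a reshaper with an explicit configuration.

    ASSUMPTION: the "language" value below is an illustrative choice.
    The import is local so this module still loads where arabic_reshaper
    is not installed.
    """
    from arabic_reshaper import ArabicReshaper

    return ArabicReshaper(configuration={"language": "Arabic"})


def prepare_text(text, reshaper=None):
    """Reshape text so letters take their joined forms before layout.

    Any object with a reshape(str) -> str method can be passed in;
    by default a configured ArabicReshaper is created.
    """
    if reshaper is None:
        reshaper = make_reshaper()
    return reshaper.reshape(text)


# Usage with the style from the answer (normal_arabic is defined there):
# paragraph = Paragraph(prepare_text(supplier_string_format), normal_arabic)
```

Passing the reshaper in as a parameter keeps one configured instance reusable across many paragraphs.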