This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Commit 384aac8

authored

Merge pull request #588 from AnkDos/HTML_to_PDF

Html to pdf

2 parents 2b8893f + 3d9eb66 commit 384aac8

File tree

6 files changed, with 84 additions and 0 deletions

Scripts/Miscellaneous/HTML_to_PDF_converter
- README.md
- Screenshots
  - cli.jpg
  - out.jpg
- app.py
- output_files
  - test_f.pdf
- requirements.txt


`‎Scripts/Miscellaneous/HTML_to_PDF_converter/README.md‎`

Lines changed: 39 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,39 @@`
	`1`	`+# HTML to PDF Converter`
	`2`	`+## A python script to convert HTML to PDF by entering input file url / website url and output file name of the pdf as arg`
	`3`	`+`
	`4`	`+## Setup Instructions`
	`5`	`+`
	`6`	+```
	`7`	`+`
	`8`	`+# Go to the root directory of the project and install the requirements by typing :`
	`9`	`+ sudo pip3 install -r requirements.txt`
	`10`	`+`
	`11`	`+# Run the script by typing :`
	`12`	`+ python3 app.py -inp <file url/ web url> -out <output file name>`
	`13`	`+`
	`14`	`+ Example (For web) :`
	`15`	`+ python3 app.py -inp https://www.google.com -out test_g.pdf`
	`16`	`+`
	`17`	`+ Example (For local HTML file):`
	`18`	`+ python3 app.py -inp /home/ankdos/ind.html -out test_f.pdf`
	`19`	`+`
	`20`	`+# The output file will be stored in the ./output_files folder`
	`21`	`+`
	`22`	`+# NOTE : If you face the problem of "parse() got an unexpected keyword argument 'override_encoding' " , then upgrade your html5lib by typing :`
	`23`	`+`
	`24`	`+ pip3 install --upgrade html5lib`
	`25`	`+`
	`26`	+```
	`27`	`+`
	`28`	`+## Screenshot taken of cli command :`
	`29`	`+`
	`30`	`+ ![output](Screenshots/cli.jpg)`
	`31`	`+`
	`32`	`+## Screenshot of the pdf output :`
	`33`	`+`
	`34`	`+ ![output](Screenshots/out.jpg)`
	`35`	`+`
	`36`	`+`
	`37`	`+### Author`
	`38`	`+`
	`39`	`+[Ankur Pandey](https://github.com/ankdos)`
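The README describes two kinds of `-inp` value: a web URL and a local HTML file. The committed script passes both straight to `weasyprint.HTML(...)` and lets the library guess. A minimal sketch of making that distinction explicit is shown here; `classify_input` and `html_kwargs` are hypothetical helpers that are not part of this commit, and treating only `http`/`https` as web input is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(inp: str) -> str:
    """Return "url" for http(s) inputs and "filename" for anything else."""
    scheme = urlparse(inp).scheme.lower()
    return "url" if scheme in ("http", "https") else "filename"


def html_kwargs(inp: str) -> dict:
    """Build explicit keyword arguments for weasyprint.HTML (hypothetical helper)."""
    if classify_input(inp) == "url":
        return {"url": inp}
    # Expand "~" and make the path absolute so relative inputs are unambiguous.
    return {"filename": str(Path(inp).expanduser().resolve())}
```

With WeasyPrint installed, the result could be used as `weasyprint.HTML(**html_kwargs(args.input)).write_pdf()`, since `HTML` accepts `url=` and `filename=` keyword arguments.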

`‎Scripts/Miscellaneous/HTML_to_PDF_converter/Screenshots/cli.jpg‎`

88.4 KB


`‎Scripts/Miscellaneous/HTML_to_PDF_converter/Screenshots/out.jpg‎`

79.5 KB


`‎Scripts/Miscellaneous/HTML_to_PDF_converter/app.py‎`

Lines changed: 27 additions & 0 deletions

`@@ -0,0 +1,27 @@`
	`1`	`+import argparse`
	`2`	`+import weasyprint`
	`3`	`+`
	`4`	`+class Html2Pdf:`
	`5`	`+    """"""`
	`6`	`+`
	`7`	`+    def __init__(self, url, output_filename):`
	`8`	`+        """"""`
	`9`	`+        self.url = url`
	`10`	`+        self.output_filename = output_filename`
	`11`	`+`
	`12`	`+    def get_pdf(self):`
	`13`	`+        """get the file url and create output"""`
	`14`	`+        pdf = weasyprint.HTML(self.url).write_pdf()`
	`15`	`+        file_name = 'output_files/' + self.output_filename`
	`16`	`+        with open(file_name, 'wb') as file_ :`
	`17`	`+            file_.write(pdf)`
	`18`	`+`
	`19`	`+`
	`20`	`+if __name__ == '__main__':`
	`21`	`+    #taking the inputs from cli`
	`22`	`+    parser = argparse.ArgumentParser()`
	`23`	`+    parser.add_argument("-inp", "--input", help="input file url")`
	`24`	`+    parser.add_argument("-out", "--output", help="output file name")`
	`25`	`+    args = parser.parse_args()`
	`26`	`+    obj = Html2Pdf(url=args.input, output_filename=args.output)`
	`27`	`+    obj.get_pdf()`
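Two things in `app.py` may be worth hardening: both arguments are optional, so omitting one passes `None` into `Html2Pdf`, and `open()` fails if `output_files/` does not exist in the working directory. The sketch below is one possible variant, not the committed code. `build_parser`, `resolve_output_path`, and `convert` are hypothetical names, and stripping directory components from the output name is an assumption about the intended behavior.

```python
import argparse
from pathlib import Path

OUTPUT_DIR = Path("output_files")  # same folder name that app.py hard-codes


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Convert HTML to PDF")
    # required=True makes argparse print a usage error instead of passing None on.
    parser.add_argument("-inp", "--input", required=True,
                        help="input file path or web URL")
    parser.add_argument("-out", "--output", required=True,
                        help="output file name")
    return parser


def resolve_output_path(output_filename: str, output_dir: Path = OUTPUT_DIR) -> Path:
    """Create output_dir if it is missing and return the full path for the PDF."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so the PDF stays inside output_dir.
    return output_dir / Path(output_filename).name


def convert(source: str, output_filename: str, output_dir: Path = OUTPUT_DIR) -> Path:
    """Render source to PDF and return the path written (needs WeasyPrint installed)."""
    import weasyprint  # imported lazily so the helpers above work without it

    target = resolve_output_path(output_filename, output_dir)
    target.write_bytes(weasyprint.HTML(source).write_pdf())
    return target
```

A script entry point could then be as small as `args = build_parser().parse_args()` followed by `convert(args.input, args.output)`.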

`‎Scripts/Miscellaneous/HTML_to_PDF_converter/output_files/test_f.pdf‎`

5.44 KB

Binary file not shown.

`‎Scripts/Miscellaneous/HTML_to_PDF_converter/requirements.txt‎`

Lines changed: 18 additions & 0 deletions

`@@ -0,0 +1,18 @@`
	`1`	`+cairocffi==1.1.0`
	`2`	`+CairoSVG==2.4.2`
	`3`	`+certifi==2020.6.20`
	`4`	`+cffi==1.14.3`
	`5`	`+chardet==3.0.4`
	`6`	`+cssselect2==0.3.0`
	`7`	`+defusedxml==0.6.0`
	`8`	`+html5lib==1.1`
	`9`	`+idna==2.10`
	`10`	`+Pillow==8.0.0`
	`11`	`+pycparser==2.20`
	`12`	`+Pyphen==0.9.5`
	`13`	`+requests==2.24.0`
	`14`	`+six==1.15.0`
	`15`	`+tinycss2==1.0.2`
	`16`	`+urllib3==1.25.11`
	`17`	`+WeasyPrint==51`
	`18`	`+webencodings==0.5.1`
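Because every dependency is pinned with `==`, and the README mentions an error caused by an outdated `html5lib`, it can help to compare the installed versions against the pins before running the script. The following is a rough sketch using only the standard library (Python 3.8+). `parse_pins` and `find_mismatches` are hypothetical helpers, and the parser assumes plain `name==version` lines like the ones in this file.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dict when the environment matches the pinned versions exactly.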

0 commit comments
