This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Commit 384aac8

Merge pull request #588 from AnkDos/HTML_to_PDF
Html to pdf
2 parents 2b8893f + 3d9eb66 commit 384aac8

6 files changed: 84 additions and 0 deletions
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# HTML to PDF Converter
## A Python script that converts HTML to PDF. It takes an input file path or website URL and an output PDF file name as arguments.

## Setup Instructions

```

# Go to the root directory of the project and install the requirements by typing:
sudo pip3 install -r requirements.txt

# Run the script by typing:
python3 app.py -inp <file path / web URL> -out <output file name>

# Example (for a website):
python3 app.py -inp https://www.google.com -out test_g.pdf

# Example (for a local HTML file):
python3 app.py -inp /home/ankdos/ind.html -out test_f.pdf

# The output file will be stored in the ./output_files folder

# NOTE: If you get the error "parse() got an unexpected keyword argument 'override_encoding'", upgrade html5lib by typing:

pip3 install --upgrade html5lib

```

## Screenshot of the CLI command:

![output](Screenshots/cli.jpg)

## Screenshot of the PDF output:

![output](Screenshots/out.jpg)


### Author

[Ankur Pandey](https://github.com/ankdos)
88.4 KB
Binary file not shown.
79.5 KB
Binary file not shown.
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
import argparse
import weasyprint


class Html2Pdf:
    """Convert an HTML page (web URL or local file) to a PDF file."""

    def __init__(self, url, output_filename):
        """Store the input URL or file path and the output file name."""
        self.url = url
        self.output_filename = output_filename

    def get_pdf(self):
        """Render the input with WeasyPrint and write the PDF to output_files/."""
        pdf = weasyprint.HTML(self.url).write_pdf()
        file_name = 'output_files/' + self.output_filename
        with open(file_name, 'wb') as file_:
            file_.write(pdf)


if __name__ == '__main__':
    # Take the inputs from the CLI
    parser = argparse.ArgumentParser()
    parser.add_argument("-inp", "--input", help="input file path or web URL")
    parser.add_argument("-out", "--output", help="output file name")
    args = parser.parse_args()
    obj = Html2Pdf(url=args.input, output_filename=args.output)
    obj.get_pdf()
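
The script writes to `output_files/` without checking that the folder exists, and it accepts a missing `-inp` or `-out` silently. The following is a minimal sketch of one way to guard against both cases. The helper names `build_parser` and `resolve_output_path` are hypothetical and not part of this commit, and the WeasyPrint rendering step is left out so that the sketch runs without it installed.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser with the same flags as app.py, but make both required.

    Hypothetical helper: the commit builds its parser inline.
    """
    parser = argparse.ArgumentParser(description="Convert HTML to PDF")
    parser.add_argument("-inp", "--input", required=True,
                        help="input file path or web URL")
    parser.add_argument("-out", "--output", required=True,
                        help="output file name")
    return parser


def resolve_output_path(output_filename, output_dir="output_files"):
    """Return the path to write to, creating output_dir if it is missing.

    Only the final component of output_filename is kept, so the file
    always lands directly inside output_dir (an assumption of this sketch).
    """
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return directory / Path(output_filename).name


if __name__ == "__main__":
    args = build_parser().parse_args()
    target = resolve_output_path(args.output)
    print(f"Would write the PDF for {args.input} to {target}")
```

With helpers like these, `get_pdf` could open the path returned by `resolve_output_path` instead of concatenating strings, and argparse would exit with a usage message when an argument is missing.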
5.44 KB
Binary file not shown.
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
cairocffi==1.1.0
CairoSVG==2.4.2
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
cssselect2==0.3.0
defusedxml==0.6.0
html5lib==1.1
idna==2.10
Pillow==8.0.0
pycparser==2.20
Pyphen==0.9.5
requests==2.24.0
six==1.15.0
tinycss2==1.0.2
urllib3==1.25.11
WeasyPrint==51
webencodings==0.5.1
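
Because every dependency is pinned with `==`, a mismatch between the pins and the installed packages (for example an older html5lib) can be detected before running the script. The following is a rough sketch of such a check, assuming Python 3.8+ for `importlib.metadata`. The function names `parse_pins` and `find_mismatches` are hypothetical and not part of this commit.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        pins = parse_pins(handle.read())
    for name, (pinned, installed) in find_mismatches(pins).items():
        print(f"{name}: pinned {pinned}, installed {installed}")
```

The comparison is a plain string match, which is adequate for exact `==` pins but would need a real version parser for ranges.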
