Commit 38b8ec3

Merge pull request avinashkranjan#1091 from Sukriti-sood/features

added pdf2video converter

2 parents 56d2e5c + 39f0dec, commit 38b8ec3

3 files changed: +134 -0 lines changed

`Pdf2Video_Converter/Readme.md`

Lines changed: 32 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,32 @@`
	`1`	`+# Pdf2Video Converter`
	`2`	`+## Short description`
	`3`	`+It takes a PDF file and a video as input and adds a spoken narration of the PDF's text to the given video.`
	`4`	`+## Setup instructions`
	`5`	`+In order to run this script, you need to have Python and pip installed on your system. After you're done installing Python and pip, run the following command from your terminal to install the requirements from the same folder (directory) of the project.`
	`6`	+```
	`7`	`+pip install -r requirements.txt`
	`8`	+```
	`9`	`+`
	`10`	`+After installing all the requirements:`
	`11`	`+`
	`12`	`+- Place the required PDF and video in the same folder as the script.`
	`13`	`+- Run the following command:`
	`14`	+ ```
	`15`	`+ python script.py`
	`16`	+ ```
	`17`	`+ or`
	`18`	+ ```
	`19`	`+ python3 script.py`
	`20`	+ ```
	`21`	`+Use whichever command matches your Python installation. Make sure that you are running the command from the same virtual environment in which the required modules are installed.`
	`22`	`+`
	`23`	`+`
	`24`	`+## Output`
	`25`	`+The user is asked for a PDF file and a video (mp4) file, and the output video file is stored in the same folder.`
	`26`	`+`
	`27`	`+`
	`28`	`+https://user-images.githubusercontent.com/55010599/119211625-f9181100-bad0-11eb-8b4b-272435807007.mp4`
	`29`	`+`
	`30`	`+`
	`31`	`+## Author(s)`
	`32`	`+[Sukriti Sood](https://github.com/Sukriti-sood)`

`Pdf2Video_Converter/requirements.txt`

Lines changed: 6 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,6 @@`
	`1`	`+gTTS==2.2.2`
	`2`	`+moviepy==1.0.3`
	`3`	`+mutagen==1.45.1`
	`4`	`+pdf2image==1.15.1`
	`5`	`+Pillow==8.2.0`
	`6`	`+pytesseract==0.3.7`
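
The pinned packages above are installed by `pip`, but two of them wrap external programs that `pip` does not provide: `pdf2image` calls Poppler's command-line tools and `pytesseract` calls the Tesseract OCR engine. The following is a minimal sketch of a pre-flight check, not part of the commit. The helper name `missing_tools` and the executable names `pdftoppm` and `tesseract` are assumptions for illustration and may differ by platform or install method.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the
    real programs being installed.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Assumed executable names: `pdftoppm` (Poppler) and `tesseract`.
    missing = missing_tools(["pdftoppm", "tesseract"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("External tools found.")
```

Running a check like this before `script.py` can turn a mid-run failure during page conversion or OCR into an early, readable message.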

`Pdf2Video_Converter/script.py`

Lines changed: 96 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,96 @@`
	`1`	`+# Import libraries`
	`2`	`+from PIL import Image`
	`3`	`+import pytesseract`
	`4`	`+from mutagen.mp3 import MP3`
	`5`	`+from moviepy.editor import VideoFileClip`
	`6`	`+import moviepy.editor as mpe`
	`7`	`+from gtts import gTTS`
	`8`	`+from pdf2image import convert_from_path`
	`9`	`+import os`
	`10`	`+`
	`11`	`+`
	`12`	`+def pdf2text(PDF_file):`
	`13`	`+`
	`14`	`+    # Getting all pages of Pdf`
	`15`	`+    pages = convert_from_path(PDF_file, 500)`
	`16`	`+`
	`17`	`+    image_counter = 1`
	`18`	`+`
	`19`	`+    print("Converting to images......")`
	`20`	`+    for page in pages:`
	`21`	`+`
	`22`	`+        filename = "page_"+str(image_counter)+".jpg"`
	`23`	`+`
	`24`	`+        page.save(filename, 'JPEG')`
	`25`	`+`
	`26`	`+        image_counter = image_counter + 1`
	`27`	`+`
	`28`	`+    filelimit = image_counter-1`
	`29`	`+`
	`30`	`+    mtext = ""`
	`31`	`+`
	`32`	`+    print("Extracting Text.......")`
	`33`	`+    for i in range(1, filelimit + 1):`
	`34`	`+`
	`35`	`+        filename = "page_"+str(i)+".jpg"`
	`36`	`+`
	`37`	`+        mtext += str(((pytesseract.image_to_string(Image.open(filename)))))`
	`38`	`+`
	`39`	`+    # Rejoining words split across lines with a hyphen ("arg-\nument" becomes "argument")`
	`40`	`+    mtext = mtext.replace('-\n', '')`
	`41`	`+`
	`42`	`+    # Deleting Image files`
	`43`	`+    for i in range(1, filelimit + 1):`
	`44`	`+        filename = "page_"+str(i)+".jpg"`
	`45`	`+        os.remove(filename)`
	`46`	`+`
	`47`	`+    return mtext`
	`48`	`+`
	`49`	`+`
	`50`	`+def text2video(mtext, video_file, Pdf_file_name):`
	`51`	`+`
	`52`	`+    language = 'en'`
	`53`	`+`
	`54`	`+    # Converting text to audio`
	`55`	`+    myobj = gTTS(text=mtext, lang=language, slow=False)`
	`56`	`+`
	`57`	`+    myobj.save("output.mp3")`
	`58`	`+`
	`59`	`+    audio = MP3("output.mp3")`
	`60`	`+`
	`61`	`+`
	`62`	`+`
	`63`	`+    # duration of audio file in seconds`
	`64`	`+    audio_length = int(audio.info.length)`
	`65`	`+`
	`66`	`+`
	`67`	`+    videoclip = VideoFileClip(video_file)`
	`68`	`+`
	`69`	`+`
	`70`	`+    if int(videoclip.duration)>audio_length:`
	`71`	`+`
	`72`	`+        # Clipping original video according to the length of the audio`
	`73`	`+        videoclip = videoclip.subclip(0, audio_length)`
	`74`	`+`
	`75`	`+    background_music = mpe.AudioFileClip("output.mp3")`
	`76`	`+`
	`77`	`+    new_clip = videoclip.set_audio(background_music)`
	`78`	`+`
	`79`	`+    name_of_video_file = Pdf_file_name.split(".pdf")[0]+"(video).mp4"`
	`80`	`+`
	`81`	`+    new_clip.write_videofile(name_of_video_file)`
	`82`	`+    os.remove("output.mp3")`
	`83`	`+`
	`84`	`+`
	`85`	`+if __name__ == "__main__":`
	`86`	`+    # Getting name of pdf file`
	`87`	`+    PDF_file = input("Enter the name of Pdf file with extension:- ")`
	`88`	`+`
	`89`	`+    # Getting name of video file`
	`90`	`+    video_file = input("Enter the name of video File with extension:- ")`
	`91`	`+`
	`92`	`+    # Extracting Text from Pdf`
	`93`	`+    text = pdf2text(PDF_file)`
	`94`	`+`
	`95`	`+    # Converting text to video`
	`96`	`+    text2video(text, video_file, PDF_file)`
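
Three small decisions in `script.py` can be looked at in isolation: rejoining hyphenated words after OCR, deriving the output file name, and choosing where to cut the video. The sketch below restates them as pure functions so they can be tried without Poppler, Tesseract, or a network connection. It is an illustration under assumptions, not the commit's code: the helper names `join_hyphenated`, `video_name_for`, and `clip_end` are hypothetical, the regex is a stricter variant of the script's plain `replace('-\n', '')`, and `Path.with_suffix` is swapped in for `split(".pdf")[0]` (the two differ for names such as `my.pdf.notes.pdf`).

```python
import re
from pathlib import Path


def join_hyphenated(text: str) -> str:
    """Rejoin words that were split across lines with a trailing hyphen.

    Stricter than the script's replace('-\\n', ''): the hyphen must follow
    a letter and the next line must start with a lowercase letter, so
    numeric ranges broken across lines are left untouched.
    """
    return re.sub(r"(?<=[A-Za-z])-\n(?=[a-z])", "", text)


def video_name_for(pdf_name: str) -> str:
    """Build the output name by dropping the final extension of the PDF name."""
    return str(Path(pdf_name).with_suffix("")) + "(video).mp4"


def clip_end(video_duration: float, audio_length: int) -> float:
    """Return the end time of the clip, using the script's comparison.

    The video is cut to the narration length only when its whole-second
    duration exceeds the audio length; otherwise it is kept as is.
    """
    if int(video_duration) > audio_length:
        return audio_length
    return video_duration


if __name__ == "__main__":
    print(join_hyphenated("the first argu-\nment is the path"))
    print(video_name_for("report.pdf"))
    print(clip_end(12.7, 10))
```

Note that when the video is shorter than the narration, neither the script nor this sketch extends the video, so the audio attached to the clip is longer than the picture.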

0 commit comments