Need To Copy Source Code of CMS Page in Python Webdriver

Question 1

I am new to Selenium 2.53.6 and the Chrome webdriver, so I may have overlooked something very simple. I need to copy the source code of a CMS page using webdriver from Python 3.3.6.

I have tried page_source, but it doesn't do what I need. I can get the page open in webdriver with the source code showing, but I haven't been able to select the content and copy it to the clipboard.

I am on Mac OS X 10.10.5, so I used:

ActionChains(driver).key_down(Keys.COMMAND).send_keys('a').key_up(Keys.COMMAND).perform()
ActionChains(driver).key_down(Keys.COMMAND).send_keys('c').key_up(Keys.COMMAND).perform()

But the only thing that happened was that an "a" and a "c" were passed to the page.

I have tried using the context menu of the finder, and I can get it to show, but I can't get it to select the "Select All" option.

btn = driver.wait.until(EC.visibility_of_element_located((By.XPATH, "//textarea[@key='postBody']")))
actionChains = ActionChains(driver)
action = actionChains.context_click(btn).perform()
links = action.find_element(By.LINK_TEXT, "Select All")
links.click()

Using the above code I get this error:

Traceback (most recent call last):
File "expertsBrazil2webdriver.py", line 80, in <module>
links = action.find_element(By.LINK_TEXT, "Select All")
AttributeError: 'NoneType' object has no attribute 'find_element'

So please tell me a workable approach to get Webdriver to copy the content of the page to the clipboard.

Question 2

What do you want to do with the content? And is it from a textarea?


Piotr Wicherski · Answer 1 · 2016-09-18 03:43:52Z

The best solution that I tried was:

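import sys
from PyQt4.QtGui import QApplication
from PyQt4.QtCore import QUrl
from PyQt4.QtWebKit import QWebPage

class Render(QWebPage):
    def __init__(self, url):
        # A QApplication must exist before the QWebPage is created.
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        # Block in the Qt event loop until _loadFinished quits it.
        self.app.exec_()

    def _loadFinished(self, result):
        # Keep the loaded frame so the caller can read its HTML.
        self.frame = self.mainFrame()
        self.app.quit()

url = 'http://webscraping.com'
r = Render(url)
# Convert the rendered HTML to UTF-8 bytes.
html = r.frame.toHtml().toUtf8().data()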

Credit goes to: WebScraping

I need to copy only the source code that is displayed on the screen. Loading the URL returns all of the CMS code and not the content I need. This is Blogger code, and it seems to call another URL for the content I need. This is evident because I have to use "element7 = driver.wait.until(EC.visibility_of_element_located((By.ID, "postingHtmlBox")))": the element has to be loaded before Selenium can access it.

score 0 · Answer 2 · 2018-08-13 19:53:08Z

You can get the content of a text area with get_attribute("value")

Something like:

content = driver.find_element_by_xpath("//textarea[@key='postBody']").get_attribute("value")

Don't fiddle with context menus or operating system key combinations; that approach is brittle and the wrong direction for solving most issues.
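
Since the asker mentions that the textarea is only available once the editor has loaded, reading get_attribute("value") may need a short wait first. The following is a minimal sketch of that idea, not a definitive implementation. The helper name read_textarea_value and its timeout and poll parameters are hypothetical, and the polling loop is a plain-Python stand-in for Selenium's WebDriverWait. It assumes only that the driver offers find_elements(by, value) and that the string "xpath" is the value behind By.XPATH.

```python
import time

# Same string that selenium.webdriver.common.by.By.XPATH holds.
XPATH = "xpath"


def read_textarea_value(driver, xpath, timeout=10.0, poll=0.5):
    """Return the "value" of the first element matching xpath.

    Hypothetical helper: polls driver.find_elements() until at least one
    element matches, then reads its value with get_attribute("value").
    Raises TimeoutError if nothing matches within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list when nothing matches,
        # so no Selenium exception class has to be caught here.
        matches = driver.find_elements(XPATH, xpath)
        if matches:
            return matches[0].get_attribute("value")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no element matched %r within %s s" % (xpath, timeout)
            )
        time.sleep(poll)


# Usage with a live session (assumes `driver` is an open WebDriver):
# content = read_textarea_value(driver, "//textarea[@key='postBody']")
```

The returned string is ordinary Python text, so it can be written to a file or processed directly without involving the browser's clipboard.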
