The fact that I had to write this shows how bad the state of software UI design is. Anyway, if anybody else is in the screwed-up situation of having to automate IE, this might help you.
def download_url_with_ie(url):
    """
    Given a url, it starts IE, loads the page, gets the HTML.
    Works only in Win32 with Python Win32com extensions enabled.
    Needs IE.

    Why? If you're forced to work with brain-dead closed source
    applications that go to tremendous length to deliver output
    specific to browsers; and the application has no interface
    other than a browser; and you want to get data into a CSV or
    XML for further analysis.

    Note: IE internally formats all HTML to stupid mixed-case,
    no-quotes-around-attributes syntax. So if you are planning to
    parse the data, make sure you study the output of this function
    rather than looking at View-source alone.
    """
    # if you are calling this function in a loop, it is more
    # efficient to open ie once at the beginning, outside this
    # function and then use the same instance to go to url's
    from win32com.client import Dispatch
    from time import sleep

    ie = Dispatch("InternetExplorer.Application")
    ie.Visible = 1  # make this 0, if you want to hide IE window
    # IE started
    ie.Navigate(url)
    # it takes a little while for page to load, so keep polling
    # until IE is idle and the document is complete (ReadyState 4)
    while ie.Busy or ie.ReadyState != 4:
        sleep(2)
    # now, we got the page loaded and DOM is filled up
    # so get the text
    text = ie.Document.body.innerHTML
    # text is in unicode, so get it into a string
    text = unicode(text)
    text = text.encode('ascii', 'ignore')
    # save some memory by quitting IE! **very important**
    ie.Quit()
    return text
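The comment in the function about calling it in a loop can be sketched roughly as below: the caller opens IE once and passes the same instance in for every url. The names fetch_pages and poll_seconds are made up for this sketch, and ie is assumed to be the object returned by Dispatch("InternetExplorer.Application").

```python
from time import sleep


def fetch_pages(ie, urls, poll_seconds=0.5):
    """Load each url in an already-running IE instance; return {url: html}.

    The caller creates `ie` once, before the loop, and is responsible
    for calling ie.Quit() afterwards. fetch_pages and poll_seconds are
    hypothetical names used only in this sketch.
    """
    pages = {}
    for url in urls:
        ie.Navigate(url)
        # Poll until IE is idle and the document is complete
        # (READYSTATE_COMPLETE is 4).
        while ie.Busy or ie.ReadyState != 4:
            sleep(poll_seconds)
        pages[url] = ie.Document.body.innerHTML
    return pages
```

To use it, create the IE object once with Dispatch, call fetch_pages(ie, list_of_urls), and put ie.Quit() in a finally clause so the browser is released even if one of the pages fails.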
Is there any way to do this with urllib? Or does it have to be by automating IE to get the IE-specific formatting for the page?
I couldn't figure out how to do that with urllib or urllib2. Even if the header is set to mimic IE, urllib adds a prefix of "Python..." - I guess the logic in the app is to see if the browser agent string matches IE's exactly.
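For anyone who wants to retry this without IE: in Python 3's urllib.request, a User-Agent header set on the Request object itself is sent in place of the default "Python-urllib/x.y" value. The sketch below only shows how the header is set. The agent string, the example URL, and the function names are assumptions for illustration, and whether the application then accepts the request has not been tested here.

```python
import urllib.request

# Assumption: an IE-style user-agent string; substitute the exact
# string that the target application expects.
IE_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_ie_request(url, user_agent=IE_USER_AGENT):
    """Return a Request whose User-Agent header is set explicitly."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_ie(url):
    """Fetch url with the IE-style header and return the raw bytes."""
    with urllib.request.urlopen(build_ie_request(url)) as response:
        return response.read()


# Inspect the header without touching the network.
request = build_ie_request("http://example.com/report")
# Header names are stored capitalized, hence "User-agent" here.
print(request.get_header("User-agent"))
```

If the application also relies on scripts or cookies that only a real browser handles, automating IE may still be the only option.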
Do you have any idea how to post form data using IE?
This is the error I get when trying your recipe, any workaround?
  File "D:\doc\getsites\g2.py", line 115, in download_url_with_ie
    ie.Navigate(url)
  File "", line 2, in Navigate
pywintypes.com_error: (-2147352567, 'Exception occurred.', (0, None, None, None, 0, -2147467259), None)
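The two negative numbers in the com_error are signed 32-bit HRESULT codes. Converting them to unsigned hex gives the form in which they are usually documented; the helper name hresult_hex is made up for this sketch.

```python
def hresult_hex(code):
    """Format a signed 32-bit COM error code as unsigned hex."""
    return "0x%08X" % (code & 0xFFFFFFFF)


print(hresult_hex(-2147352567))  # 0x80020009
print(hresult_hex(-2147467259))  # 0x80004005
```

0x80020009 is DISP_E_EXCEPTION, the generic "Exception occurred" wrapper, and 0x80004005 is E_FAIL, an unspecified failure, so the codes by themselves do not say why Navigate failed.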
Do you know of a good reference for the COM capabilities of IE?
The WebBrowser control is documented on MSDN at http://msdn.microsoft.com/workshop/browser/prog_browser_node_entry.asp
The InternetExplorer object is at http://msdn.microsoft.com/workshop/browser/webbrowser/reference/objects/internetexplorer.asp
Hi Babu,
I am new to Jython. I need some help in automating IE through Jython. I was going through your website, which is pretty cool, and I found the following piece of code.
My requirement is just to close the IE window through Jython. I can see the statement "ie.Quit()" doing that. But what is win32com.client? I would really appreciate your prompt response.
#7 - Native python has a wrapper around Win32 libraries. With that we can use ActiveX components similar to how we use them from Microsoft scripting environment. However, Jython is Python implemented on top of Java. If you can figure out how to access ActiveX components from Java (that would be JNI, right?), you may be able to access it through Jython. I am not very sure though.
Hi Babu,
When I try to print the HTML file using Acrobat Distiller to convert it to PostScript, it pops up a dialog prompting for the file name.
Is there a way to suppress the dialog and set the output PostScript file programmatically? Also, is there a way to select Acrobat Distiller as the printer if it is not the default printer on the machine where I run my script?
I am new to IE objects.
Right now I tried using
.ExecWB 6, -1
Please help
Thanks
Abhi.