As a method of a COMObject, getElementById is built by win32com dynamically.
On my computer, if url is http://ieeexplore.ieee.org/xpl/periodicals.jsp, it will be almost equivalent to
def getElementById(self):
return self._ApplyTypes_(3000795, 1, (12, 0), (), 'getElementById', None,)
If the url is www.baidu.com, it will be almost equivalent to
def getElementById(self, v=pythoncom.Missing):
ret = self._oleobj_.InvokeTypes(1088, LCID, 1, (9, 0), ((8, 1),),v
)
if ret is not None:
ret = Dispatch(ret, 'getElementById', {3050F1FF-98B5-11CF-BB82-00AA00BDCE0B})
return ret
Obviously, if you pass an argument to the first code, you'll receive a TypeError. But if you try to use it directly, namely, invoke ie.Document.getElementById(), you won't receive a TypeError, but a com_error.
Why win32com built the wrong code?
Let us look at ie and ie.Document. They are both COMObjects, more precisely, win32com.client.CDispatch instances. CDispatch is just a wrapper class. The core is attribute _oleobj_, whose type is PyIDispatch.
>>> ie, ie.Document
(<COMObject InternetExplorer.Application>, <COMObject <unknown>>)
>>> ie.__class__, ie.Document.__class__
(<class win32com.client.CDispatch at 0x02CD00A0>,
<class win32com.client.CDispatch at 0x02CD00A0>)
>>> oleobj = ie.Document._oleobj_
>>> oleobj
<PyIDispatch at 0x02B37800 with obj at 0x003287D4>
To build getElementById, win32com needs to get the type information for getElementById method from _oleobj_. Roughly, win32com uses the following procedure
typeinfo = oleobj.GetTypeInfo()
typecomp = typeinfo.GetTypeComp()
x, funcdesc = typecomp.Bind('getElementById', pythoncom.INVOKE_FUNC)
......
funcdesc contains almost all import information, e.g. the number and types of the parameters.
If url is http://ieeexplore.ieee.org/xpl/periodicals.jsp, funcdesc.args is (), while the correc funcdesc.args should be ((8, 1, None),).
Long story in short, win32com had retrieved the wrong type information, thus it built the wrong method.
I am not sure who is to blame, PyWin32 or IE. But base on my observation, I found nothing wrong in PyWin32's code. On the other hand, the following script runs perfectly in Windows Script Host.
var ie = new ActiveXObject("InternetExplorer.Application");
ie.Visible = 1;
ie.Navigate("http://ieeexplore.ieee.org/xpl/periodicals.jsp");
WScript.sleep(5000);
ie.Document.getElementById("browse_keyword").value = "Computer";
Duncan has already pointed out IE's compatibility mode can prevent the problem. Unfortunately, it seems it's impossible to enable compatibility mode from a script.
But I found a trick, which can help us bypass the problem.
First, you need to visit a good site, which gives us a HTML page, and retrieve a correct Document object from it.
ie = win32com.client.DispatchEx('InternetExplorer.Application')
ie.Visible = 1
ie.Navigate('http://www.haskell.org/arrows')
time.sleep(5)
document = ie.Document
Then jump to the page which doesn't work
ie.Navigate('http://ieeexplore.ieee.org/xpl/periodicals.jsp')
time.sleep(5)
Now you can access the DOM of the second page via the old Document object.
document.getElementById('browse_keyword').value = "Computer"
If you use the new Document object, you will get a TypeError again.
>>> ie.Document.getElementById('browse_keyword')
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
TypeError: getElementById() takes exactly 1 argument (2 given)