Here is a code snippet whereby, it would load up the file object into memory, and get the text between a tag. For example a tag such as:
<school>University of Helsinki</school>
contained in a example file /tmp/myinfo.xml I would now run make use of the snippet like:
file_object = open('/tmp/myinfo.xml') print parse(file_object, 'school') file_object.close()
from xml.dom import minidom def parse(infile, tag): output = '' xmldoc = minidom.parse(infile) grab = xmldoc.getElementsByTagName(tag) for data in grab: if _debug: print data.toxml() childNodes = data.childNodes for node in childNodes: output += node.data return output + '\n'
Notice I iterated through “grab”, as in XML the earlier piece of xml could have be written like:
<school> University of Helsinki </school>