skip to main
|
skip to sidebar
Linux Info
Information about various Linux / UNIX / OS X topics
Saturday, November 19, 2011
ImageMatik
Convert all TIFs in a directory to JPEGs
mogrify -format jpg -quality 50 *.tiff
http://linuxuser32.wordpress.com/2007/06/16/batch-image-convert-scale-thumbnail-jpegs-pdf/
Thursday, November 10, 2011
Python: lxml insert element before and after given element
Based on:
http://stackoverflow.com/questions/7474972/python-lxml-append-element-after-another-element
from lxml import etree
xml = etree.parse('old_xml.xml')
count = 1
for pp in xml.xpath('/response/items/item'):
prod_id = etree.Element('new_element')
# this element will be inserted
prod_id.text = "%s" % count
nav = pp.find('name')
# we will insert before/after this element
div = nav.getparent()
pp.insert(pp.index(nav), prod_id)
# insert before 'name'
pp.insert(pp.index(nav) + 1, prod_id)
# insert after 'name'
count += 1
out = open('new_xml.xml','w')
out.write(etree.tostring(xml, pretty_print=True))
out.close()
Python: Detect Charset and convert to Unicode
Converting data to Unicode via chardet
http://stackoverflow.com/questions/2686709/encoding-in-python-with-lxml-complex-solution
import chardet
from lxml import html
f = open('my.xml')
content = f.read()
f.close()
encoding = chardet.detect(content)['encoding']
if encoding != 'utf-8':
content = content.decode(encoding, 'replace').encode('utf-8')
Converting Data to Unicode via UnicodeDammit
http://lxml.de/elementsoup.html
from BeautifulSoup import UnicodeDammit
converted = UnicodeDammit(content)
if not converted.unicode:
raise UnicodeDecodeError(
"Failed to detect encoding, tried [%s]",
', '.join(converted.triedEncodings))
# print converted.originalEncoding
return converted.unicode
Credit goes to Ian Bicking and others on the lxml team
Tuesday, November 08, 2011
Using lxml xpath to get elements with a default namespace
xml_string="""
<a xmlns="http://www.w3.org/1999/xhtml">
<b>
<r>
<Value>string</Value>
</r>
<r>
<Value>string</Value>
</r>
</b>
<e>
<string>string1</string>
<string>string2</string>
</e>
</a>
"""
from lxml import etree
xml = etree.fromstring(xml_string)
xml.xpath('/w3:a/w3:b/w3:r/w3:Value/text()', namespaces={'w3':'http://www.w3.org/1999/xhtml'})
Newer Posts
Older Posts
Home
Subscribe to:
Posts (Atom)
Blog Archive
►
2024
(1)
►
August
(1)
►
2017
(2)
►
November
(1)
►
June
(1)
►
2016
(4)
►
June
(1)
►
March
(3)
►
2015
(15)
►
December
(2)
►
November
(2)
►
August
(1)
►
July
(3)
►
June
(1)
►
May
(3)
►
April
(1)
►
March
(2)
►
2014
(11)
►
September
(7)
►
April
(1)
►
March
(1)
►
January
(2)
►
2013
(17)
►
December
(1)
►
November
(1)
►
October
(1)
►
July
(3)
►
May
(3)
►
April
(2)
►
March
(2)
►
February
(4)
►
2012
(11)
►
November
(1)
►
October
(1)
►
April
(2)
►
March
(5)
►
February
(2)
▼
2011
(26)
▼
November
(4)
ImageMatik Convert all TIFs in a directory to JP...
Python: lxml insert element before and after gi...
Python: Detect Charset and convert to Unicode C...
Using lxml xpath to get elements with a default n...
►
September
(2)
►
August
(1)
►
July
(1)
►
June
(1)
►
May
(3)
►
April
(8)
►
March
(1)
►
February
(2)
►
January
(3)
►
2010
(30)
►
December
(1)
►
November
(3)
►
October
(3)
►
August
(8)
►
July
(3)
►
June
(8)
►
May
(1)
►
April
(1)
►
February
(1)
►
January
(1)
►
2009
(18)
►
December
(2)
►
October
(4)
►
September
(2)
►
August
(2)
►
July
(1)
►
May
(1)
►
February
(2)
►
January
(4)
►
2008
(12)
►
December
(1)
►
October
(2)
►
September
(1)
►
August
(2)
►
July
(2)
►
June
(1)
►
April
(2)
►
February
(1)
►
2007
(17)
►
November
(2)
►
September
(1)
►
August
(1)
►
April
(4)
►
March
(3)
►
February
(3)
►
January
(3)
►
2006
(16)
►
December
(9)
►
November
(6)
►
April
(1)
►
2005
(3)
►
March
(3)
Links
Joe Jasinski
Google News
Jaz Studios
About Me
Joe Jasinski
View my complete profile