unicode - Python TypeError using the translate() method
I am trying to build a function in Python that pulls specific characters out of a string and returns each of the remaining words on a separate line. Apostrophes must be removed, and a contraction must be split, with the second half moving to a new line.
For instance, I have this sentence fragment:
", that doesn't mean him."
and I want to remove these punctuation characters:
",'."
It should return:
that
doesn
t
mean
him
Here is the function I've written:
    def remove_chars(frag, punc):
        if "'" in frag:
            frag = frag.replace("'", " ")
        frag = frag.translate(None, punc)
        frag = frag.split(" ")
        for word in frag:
            print word

    remove_chars(", that doesn't mean him.", ",'.")
And here is the error I'm receiving:
TypeError: deletions are implemented differently for unicode
Thanks in advance for any help with this.
The unicode.translate() method is indeed different from the str.translate() method. It takes only one argument, a dictionary mapping integer codepoint values to other values. To delete a character, the other value should be None.
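As a minimal sketch of that mapping form, here is a hand-written table. The names `table`, `text` and `result` are made up for illustration. A value of None deletes the character, while a unicode string value replaces it. The u"" literals keep the snippet valid on Python 2; on Python 3 the built-in str.translate() follows the same dictionary rules.

```python
# Keys are integer code points; values are None (delete) or a replacement.
table = {
    ord(u","): None,   # delete commas
    ord(u"."): None,   # delete periods
    ord(u"'"): u" ",   # replace apostrophes with a space
}

text = u"it's here, now."
result = text.translate(table)
# result == u"it s here now"
```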
You can trivially create such a dictionary using dict.fromkeys():
    mapping = dict.fromkeys(map(ord, punc))
    frag = frag.translate(mapping)
Since the keys must be integers, I used ord to map each character in the string punc to its corresponding integer codepoint. dict.fromkeys() creates a dictionary with those integer keys and gives each of them the default value None.
Demo:
    >>> punc = ",'."
    >>> dict.fromkeys(map(ord, punc))
    {44: None, 46: None, 39: None}
    >>> mapping = dict.fromkeys(map(ord, punc))
    >>> u", that doesn't mean him.".translate(mapping)
    u' that doesnt mean him'
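Putting the pieces together, one possible version of the whole function might look like the sketch below. It assumes frag and punc are unicode strings (plain str on Python 3), and it returns the words as a list instead of printing inside the function, which is a design choice of this sketch rather than something the question asked for.

```python
def remove_chars(frag, punc):
    """Return the words of frag with the characters in punc removed.

    Apostrophes are turned into spaces first so contractions split in two.
    """
    # Replace apostrophes with spaces so "doesn't" becomes "doesn t".
    frag = frag.replace(u"'", u" ")
    # Build {codepoint: None} so translate() deletes each character in punc.
    mapping = dict.fromkeys(map(ord, punc))
    frag = frag.translate(mapping)
    # split() with no argument drops empty strings caused by extra spaces.
    return frag.split()


words = remove_chars(u", that doesn't mean him.", u",'.")
# words == [u"that", u"doesn", u"t", u"mean", u"him"]
for word in words:
    print(word)
```

Calling split() without an argument matters here: after the comma is deleted the string starts with a space, so split(" ") would produce an empty first element.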