Python 2/3 and Unicode file paths

This bug popped up in a script of mine:

For Python 2:

>>> import os
>>> os.path.abspath('.')
'C:\\Users\\yuv\\Desktop\\YuvDesktop\\??????'
>>> os.path.abspath(u'.')
u'C:\\Users\\yuv\\Desktop\\YuvDesktop\\\u05d0\u05d1\u05d2\u05d3\u05d4\u05d5'

For Python 3:

>>> import os
>>> os.path.abspath('.')
'C:\\Users\\yuv\\Desktop\\YuvDesktop\\\u05d0\u05d1\u05d2\u05d3\u05d4\u05d5'
>>> os.path.abspath(b'.')
b'C:\\Users\\yuv\\Desktop\\YuvDesktop\\??????'

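In case you were wondering, that odd run of question marks is a completely useless and invalid path. As far as I can tell, the byte-string version has to squeeze the path into the system's ANSI code page, and every character that doesn't fit is replaced with a literal '?'. The Windows cmd prompt sometimes shows question marks that aren’t garbage, but I assure you, these are useless and wrong question marks.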

The solution is to always pass Unicode strings to path functions: u'...' literals in Python 2, plain str in Python 3. A bit of a pain. Am I the only one who thinks this is failing silently? I’ll file it in the bug tracker and we’ll see.
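
Here is a rough sketch of what "always use Unicode strings" could look like in a script that has to run on both Python 2 and 3. The helper names to_text_path and abspath_text are made up for illustration, and falling back to UTF-8 when the filesystem encoding is unknown is my own assumption. Note that it only helps if the path is still intact when it arrives: a byte path that has already been turned into question marks cannot be recovered this way.

```python
import os
import sys


def to_text_path(path, encoding=None):
    """Return `path` as text (unicode on Python 2, str on Python 3).

    Hypothetical helper: byte strings are decoded with the filesystem
    encoding, falling back to UTF-8 (an assumption) if it is unknown.
    """
    if isinstance(path, bytes):  # on Python 2, bytes is an alias for str
        enc = encoding or sys.getfilesystemencoding() or 'utf-8'
        return path.decode(enc)
    return path


def abspath_text(path):
    """Like os.path.abspath, but always hands os.path a text path."""
    return os.path.abspath(to_text_path(path))


# Both calls go through the text code path, so the result is a text string.
print(repr(abspath_text('.')))
print(repr(abspath_text(b'.')))
```

Converting at the edge of the script and using text paths everywhere after that keeps the byte-string code path out of the picture entirely.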
