Warning: touch(): Unable to create file /home/teethgrindercouk-296/public_html/prod/wp-content//fonts/99f2fb3ddbfdf5d5b69c937b9142a2a3.css because No such file or directory in /home/teethgrindercouk-296/public_html/prod/wp-admin/includes/class-wp-filesystem-direct.php on line 516

Normalize URL path python

I had a problem with some URLs that we found on someones website. They looked like this:

<a href=”http://example.com/../../../path/page.html”>, here is the same link: test. Notice that when you mouse over it, Firefox normalizes the URL so it looks correct.

Using urlparse in Python:

print urlparse.urljoin( ‘http://site.com/’, ‘/path/../path/.././path/./’ )

How, poopy.

So, we need to do better than that. The os module has a path notmalizer in it, but this would go squiffy when run on a Windows box because it will translate all the / to \. But there is a posixpath.py module which we can use:

import urlparse
import posixpath

def join(base,url):
join = urlparse.urljoin(base,url)
url = urlparse.urlparse(join)
path = posixpath.normpath(url[2])
return urlparse.urlunparse(

# strange paths
print join( ‘http://site.com/’‘/path/../path/.././path/./’ )
print join( ‘http://site.com/path/x.html’‘/path/../path/.././path/./y.html’ )

# paths that .. up too far
print join( ‘http://site.com/’‘../../../../path/’ )
print join( ‘http://site.com/x/x.html’‘../../../../path/moo.html’ )

# how path and base path combine
print join( ‘http://site.com/99/x.html’‘1/2/3/moo.html’ )
print join( ‘http://site.com/99/x.html’‘../1/2/3/moo.html’ )

This prints:


Cool, huh?