Dan A
Courses Plus Student 4,036 Points
regex differences between Python 2.7.8 and 3.4
I have been doing all the Python courses here in 2.7.8 because of the requirements of my 'next big thing' application. So far I've been able to suss out the differences on my own as I come across them. However, I am stumped by this one:
Following along with the video on manipulating email addresses, the following code works in 3.4 (as in the video), but I get wonky matches in 2.7.8:
print(re.findall(r'''
    \b@[-\w\d.]*  # match a word boundary, an 'at' symbol, and then any number of characters
    [^gov\t]+     # match 1+ characters that are not 'g', 'o', 'v', or a tab
    \b            # match another word boundary
''', data, re.VERBOSE | re.I))
And the results in 2.7.8:
[u'@teamtreehouse.com (555) 555-5555 Teacher, ', u'@teamtreehouse.com (555) 555-5554 Teacher, ', u'@camelot.co.uk ', u'@norrbotten.co.se ', u'@killerrabbit.com Enchanter, Killer Rabbit ', u'@teamtreehouse.com (555) 555-5543 ', u'@tardis.co.uk Time ', u'@example.com 555-555-5552 Example, Example ', u'@us.gov 555 555-5551 President, United States ', u'@teamtreehouse.com (555) 555-5553 Teacher, ', u'@empire.gov (555) 555-4444 Sith ', u'@spain.gov First Deputy Prime Minister, Spanish ']
Am I doing something weird? Or are there some differences I should know about? (A Google search didn't help me.)
Thanks
4 Answers
Kenneth Love
Treehouse Guest Teacher
Hmm. I just ran your line against my local Python 2.7.8 and got:
[u'@teamtreehouse.com', u'@teamtreehouse.com', u'@camelot.co.uk', u'@norrbotten.co.se', u'@killerrabbit.com', u'@teamtreehouse.com', u'@tardis.co.uk', u'@example.com', u'@us.', u'@teamtreehouse.com', u'@empire.', u'@spain.']
as the output. That looks like what I was expecting with 3.4, too. This is using the codecs code, too.
Kenneth Love
Treehouse Guest Teacher
I wonder if the space between re.VERBOSE and re.I is causing the issue?
Dan A
Courses Plus Student 4,036 Points
It doesn't seem to make a difference when I take the spaces out. I put them in because I was getting a PEP 8 "E227 missing whitespace around bitwise or shift operator" warning. (Some of those messages are rather annoying.)
Kenneth Love
Treehouse Guest Teacher
Why would that generate PEP 8 warnings? The | (the bitwise operator it's referring to) is inside of a function call, so it shouldn't have spaces around it. Weird. But if it doesn't change the output, don't worry about it.
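A quick way to confirm that the whitespace around | cannot affect matching is to compare the combined flag values directly. This is only a small sketch; the variable names are made up for illustration.

```python
import re

# Whitespace around | is purely a style matter: both expressions
# combine the same two flags into the same value.
spaced = re.VERBOSE | re.I
unspaced = re.VERBOSE|re.I

same_flags = (spaced == unspaced)
print(same_flags)
```

Since both spellings give re.findall the same flags, the wonky matches have to come from somewhere else, such as the pattern or the data.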
Dan A
Courses Plus Student 4,036 Points
I have tested the regex in a few web-based regex testers, and it seems they are also using the 2.7 interpreter, as I am getting the same erroneous results as on my local machine.
I should note that there is a difference in how I open the file using 2.7.8. In order to read Unicode characters, I need to use the codecs module (or is it a package or library?). Maybe something is happening with the file handling that is causing the discrepancy. Here's the code:
import codecs
import re
names_file = codecs.open('names.txt', encoding='utf-8')
data = names_file.read()
names_file.close()
As a side question, is this the proper way to open and read a file that has unicode characters?
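On the side question: codecs is a module in the standard library, and codecs.open does work for this. One common alternative, offered here as a sketch rather than as the course's method, is io.open inside a with block, which behaves the same in 2.7 and 3.x and closes the file automatically. The helper name read_text and the sample line written to the temporary file are made up for illustration.

```python
import io
import os
import tempfile


def read_text(path, encoding='utf-8'):
    """Return the whole file as text, decoded with the given encoding."""
    # io.open accepts an encoding in both Python 2.7 and 3.x;
    # the with block closes the file even if read() raises.
    with io.open(path, encoding=encoding) as text_file:
        return text_file.read()


# Write a small sample file so the example is self-contained.
# The line below is invented; it is not the real names.txt.
sample_dir = tempfile.mkdtemp()
sample_path = os.path.join(sample_dir, 'names.txt')
sample_line = u'\xd6sterberg, Sven\tsven@example.com\n'
with io.open(sample_path, 'w', encoding='utf-8') as text_file:
    text_file.write(sample_line)

data = read_text(sample_path)
print(repr(data))
```

In Python 3 the built-in open takes the same encoding argument, so io.open is mainly useful when the same script has to run under both versions.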
Jonathan Mitten
Courses Plus Student 11,197 Points
I think I've solved this issue for those of us running the code in our own console. I suspect the problem comes from copying and pasting the names.txt file into a text editor that replaces tabs with spaces. My setup for editing Python in Sublime Text 3 swaps each tab for four spaces, which breaks the regex rules that depend on tabs.
Instead of copying the text from the workspace, download the workspace and move the names.txt file into your working directory (overwrite the current file if it's still in there).
Try your regexes as the videos have you write them, and see if they work as Kenneth Love says they should.
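The tab explanation can be checked without the real file. The sketch below runs the pattern from the question against one tab-separated line and against the same line with each tab swapped for four spaces. The sample line is an assumption modelled on the output in the question, not a copy of names.txt, and find_domains is a made-up helper name.

```python
import re

# The pattern from the question, unchanged.
EMAIL_DOMAIN = r'''
    \b@[-\w\d.]*  # word boundary, an 'at' symbol, then domain-like characters
    [^gov\t]+     # 1+ characters that are not 'g', 'o', 'v', or a tab
    \b            # another word boundary
'''


def find_domains(text):
    """Run the question's pattern with the same flags."""
    return re.findall(EMAIL_DOMAIN, text, re.VERBOSE | re.I)


# Hypothetical sample line: fields separated by tabs.
tab_line = ('Love, Kenneth\tkenneth@teamtreehouse.com\t'
            '(555) 555-5555\tTeacher, Treehouse\t@kennethlove\n')
# The same line after an editor has converted each tab to four spaces.
space_line = tab_line.replace('\t', '    ')

tab_matches = find_domains(tab_line)
space_matches = find_domains(space_line)
print(tab_matches)
print(space_matches)
```

With tabs, the match stops at the end of the domain because the negated class excludes the tab. With spaces, nothing stops it until it reaches a 'g', 'o', or 'v', so it runs on into the phone number and job title, much like the results in the question.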
Dan A
Courses Plus Student 4,036 Points
Sorry for the delay. Thanks for looking into this! This regex course has been very helpful!