Quinton Dobbs
5,149 Points
Workspaces and my terminal are giving different output for the same code
When I run the code below in Workspaces, which is running Python 3.5.0, I get the same output that Kenneth gets in the video. But when I run the same code in my terminal, which is running Python 3.6.3, I get the following output:
My output:
['@teamtreehouse.com (555) 555-5555 Teacher, ', '@teamtreehouse.com (555) 555-5554 Teacher, ', '@camelot.co.uk ', '@norrbotten.co.se ', '@killerrabbit.com Enchanter, Killer Rabbit ', '@teamtreehouse.com (555) 555-5543 ', '@tardis.co.uk Time ', '@example.com 555-555-5552 Example, Example ', '@us.gov 555 555-5551 President, United States ', '@teamtreehouse.com (555) 555-5553 Teacher, ', '@empire.gov (555) 555-4444 Sith ', '@spain.gov First Deputy Prime Minister, Spanish ']
Kenneth's output:
['@teamtreehouse.com', '@teamtreehouse.com', '@camelot.co.uk', '@norrbotten.co.se', '@killerrabbit.com', '@teamtreehouse.com', '@tardis.co.uk', '@example.com', '@us.', '@teamtreehouse.com', '@empire.', '@spain.']
Could this be because of the different versions of Python? If so, how would you change the code to get the same output as Kenneth?
import re
name_file = open("names.txt", encoding="utf-8")
data = name_file.read()
name_file.close()
#print(re.match(r"Love", data))
#print(re.search(r"Kenneth", data))
#print(re.findall(r"\(?\d{3}\)?-?\s?\d{3}-\d{4}", data))
#print(re.findall(r"\w*, \w+", data))
#print(re.findall(r"[-\w\d+.]+@[-\w\d.]+", data))
#print(re.findall(r"\b[trehous]{9}\b", data, re.I))
print(re.findall(r'''
\b@[-\w\d.]* # word boundary, then @, then any number of word characters, hyphens, or dots
[^gov\t]+ # one or more characters that are not g, o, v, or a tab
\b # closing word boundary
''', data, re.VERBOSE | re.I))
2 Answers
Jonathan Mitten
Courses Plus Student 11,197 Points
Actually, I think I've solved this issue, both for me and for others running the code in a local console. I suspect the problem comes from copying and pasting the names.txt file into a text editor that replaces tabs with spaces. My Sublime Text 3 setup for editing Python swaps each tab for 4 spaces, so some of the regex patterns no longer match as intended.
Instead of copying the text from the workspace, try downloading the workspace and moving its names.txt file into your working directory (overwriting the current file if it's still in there).
Then try your regex as the videos have you do it, and see if it works as Kenneth Love says it should.
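To illustrate the tabs-versus-spaces explanation, here is a minimal sketch. The record in `tab_line` is hypothetical, modelled on the outputs quoted in this thread rather than copied from the real names.txt, and `find_domains` is just a helper name made up for this example. It runs the pattern from the question against the same line twice, once with tab separators and once with each tab swapped for four spaces.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r'''
    \b@[-\w\d.]*
    [^gov\t]+
    \b
'''


def find_domains(text):
    """Run the course pattern with the same flags used in the question."""
    return re.findall(PATTERN, text, re.VERBOSE | re.I)


# Hypothetical record (an assumption, not the real file contents).
tab_line = "Love, Kenneth\tkenneth@teamtreehouse.com\t(555) 555-5555\tTeacher, Treehouse"
# Simulate an editor that converts every tab into four spaces.
space_line = tab_line.replace("\t", "    ")

tab_result = find_domains(tab_line)
space_result = find_domains(space_line)

print(tab_result)    # the match stops at the tab after the domain
print(space_result)  # the match runs on through the space-separated fields
```

If this is what happened to your copy of the file, checking `"\t" in data` after reading names.txt should print `False`.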
Jonathan Mitten
Courses Plus Student 11,197 Points
I'm experiencing the same issue. However, my local Python version is 3.5.4, and the Workspace version is 3.5.0.
Before you read further, check my other answer in this thread. I'm curious whether it resolves most of these issues. It did mine.
At first, I thought it could be the way I was opening and reading the file, which I was initially doing with the with open technique. I changed it to how @KennithLove does it in the course video:
names_file = open("names.txt", encoding="utf-8")
data = names_file.read()
names_file.close()
Locally,
print(re.findall(r'''
\b@[-\w\d.]*
[^gov\t]+
\b
''', data, re.VERBOSE|re.I))
results in:
['@teamtreehouse.com (555) 555-5555 Teacher, ', '@teamtreehouse.com (555) 555-5554 Teacher, ', '@camelot.co.uk ', '@norrbotten.co.se ', '@killerrabbit.com Enchanter, Killer Rabbit ', '@teamtreehouse.com (555) 555-5543 ', '@tardis.co.uk Time ', '@example.com 555-555-5552 Example, Example ', '@us.gov 555 555-5551 President, United States ', '@teamtreehouse.com (555) 555-5553 Teacher, ', '@empire.gov (555) 555-4444 Sith ', '@spain.gov First Deputy Prime Minister, Spanish ']
but in the workspace,
>>> print(re.findall(r'''
... \b@[-\w\d.]*
... [^gov\t]+
... \b
... ''', data, re.VERBOSE|re.I))
results in:
['@teamtreehouse.com', '@teamtreehouse.com', '@camelot.co.uk', '@norrbotten.co.se', '@killerrabbit.com', '@teamtreehouse.com', '@tardis.co.uk', '@example.com', '@us.', '@teamtreehouse.com', '@empire.', '@spain.']
After poking around in these community boards, seeing at least one other student with the same problem, I took the issue to the regex101 website, and it agrees with our regex engines:
https://regex101.com/r/Bf4Xz4/1
Playing with the regex options (far right of the regular expression field), I turned on "ungreedy" and looked that up in the Python docs (https://docs.python.org/3/library/re.html), which say:
*?, +?, ??
The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE <.*> is matched against '<a> b <c>', it will match the entire string, and not just '<a>'. Adding ? after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using the RE <.*?> will match only '<a>'.
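The quoted passage can be checked directly in a Python shell. This is only a small sketch of the docs' own example; the variable names are made up here.

```python
import re

text = "<a> b <c>"

# Greedy: .* grabs as much as it can, so the match spans the whole string.
greedy = re.match(r"<.*>", text).group()

# Non-greedy: .*? grabs as little as it can, so the match ends at the first '>'.
lazy = re.match(r"<.*?>", text).group()

print(greedy)  # <a> b <c>
print(lazy)    # <a>
```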
I put a space between gov and \t, so it reads [^gov \t]+, and now I'm getting the same result as Kenneth Love: https://regex101.com/r/9F8wTZ/1. But the output in the workspace is different! https://w.trhou.se/qs8fllit1m
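One likely reason the extra space changes the result locally: with re.VERBOSE, whitespace in the pattern is ignored except inside a character class, so a space typed inside the brackets becomes one more excluded character. The sketch below assumes the edited class is `[^gov \t]+` and uses two hypothetical space-separated records modelled on the outputs in this thread, not the real names.txt.

```python
import re

# Original class: excludes g, o, v and tab (case-insensitively).
ORIGINAL = r'''
    \b@[-\w\d.]*
    [^gov\t]+
    \b
'''

# Edited class: the space inside the brackets is kept even under re.VERBOSE,
# so it now excludes the space character as well.
WITH_SPACE = r'''
    \b@[-\w\d.]*
    [^gov \t]+
    \b
'''

# Hypothetical records with spaces where the course file would have tabs.
sample = (
    "Love, Kenneth kenneth@teamtreehouse.com (555) 555-5555 Teacher, Treehouse\n"
    "Obama, Barack president.44@us.gov 555 555-5551 President, United States"
)

flags = re.VERBOSE | re.I
original_result = re.findall(ORIGINAL, sample, flags)
with_space_result = re.findall(WITH_SPACE, sample, flags)

print(original_result)    # matches run on past the domain
print(with_space_result)  # ['@teamtreehouse.com', '@us.']
```

On a file that still contains real tabs, the original class already stops at the tab after each domain.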