Iskander Ismagilov
13,298 Points
Do not show email with .gov
How should the pattern be changed so that it does not show emails ending in .gov from the name.txt file?
print(findall(r"\b@[-\w\d.]*[^gov\t]+\b", data, I))
1 Answer
Chris Freeman
Treehouse Moderator 68,441 Points
You can use a negative lookbehind assertion of the format (?<!...), where ... is the pattern.
First isolate the part of the pattern that matches the top-level domain. A "dot" needs to be outside the character class to mark the transition. This will match the domain part (from the @ onward) of every email address:
print(re.findall(r"@[-\w\d.]+\.[\w\d]+\b", data, re.I))
Now add the negative lookbehind to say "at this point in the pattern, the previous characters cannot be .gov":
print(re.findall(r"@[-\w\d.]+\.[\w\d]+(?<!\.gov)\b", data, re.I))
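To see the two patterns side by side, here is a minimal sketch. The sample text in data is made up for illustration (it is not the contents of the actual file), and the dot inside the lookbehind is escaped so that it matches only a literal dot:

```python
import re

# Hypothetical sample data standing in for the file contents (tab-separated).
data = (
    "Doe, Jane\tjane@example.com\t555-0100\n"
    "Roe, Rick\trick.roe@agency.gov\t555-0101\n"
    "Poe, Ann\tann@mail.example.co.uk\t555-0102\n"
)

# Matches the "@domain.tld" part of every address.
all_domains = re.findall(r"@[-\w\d.]+\.[\w\d]+\b", data, re.I)

# Same pattern, but the negative lookbehind rejects a match whose
# last four characters are ".gov" (the dot is escaped to be literal).
non_gov = re.findall(r"@[-\w\d.]+\.[\w\d]+(?<!\.gov)\b", data, re.I)

print(all_domains)
print(non_gov)
```

The trailing \b matters here: without it, the regex engine could backtrack and return a shortened match such as "@agency.go" that slips past the lookbehind.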
Note that the leading \b is redundant: in an email address the @ follows a word character, so there is already a word boundary at that position.
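A quick check of that note, as a sketch with made-up sample strings: for ordinary addresses, where the @ follows a word character, the pattern returns the same result with or without the leading \b. The two only differ when the @ is preceded by a non-word character such as a space.

```python
import re

with_b = r"\b@[-\w\d.]+\.[\w\d]+(?<!\.gov)\b"
without_b = r"@[-\w\d.]+\.[\w\d]+(?<!\.gov)\b"

# Hypothetical sample: ordinary addresses, "@" preceded by a word character.
emails = "jane@example.com rick.roe@agency.gov"
same_for_emails = (
    re.findall(with_b, emails, re.I) == re.findall(without_b, emails, re.I)
)

# Hypothetical edge case: "@" preceded by a space, so there is no boundary.
stray = "contact: @support.example.org"
stray_with_b = re.findall(with_b, stray, re.I)
stray_without_b = re.findall(without_b, stray, re.I)

print(same_for_emails)
print(stray_with_b, stray_without_b)
```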
Iskander Ismagilov
13,298 Points
Thank you Chris.