Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Posted May 11, 2015 3:26am by Wolverine .py

create a variable named contacts that is an re.search

"Create a new variable named contacts that is an re.search() where the pattern catches the email address and phone number from string. Name the email pattern email and the phone number pattern phone. The comma and spaces * should not* be part of the groups."

emails.py

import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

contacts = re.search(r^
    (?P<email>[-\w\d.+]+@[-\w\d.]+) 
     ,\s
     (?P<phone>\d\{3}-\d\{3}-\d{4}$
     , string)

6 Answers

May 11, 2015 3:37am

A few things needed cleaning up:

contacts = re.search(r''' #<-- Wrap in multiline quote. Remove leading caret (^)
    (?P<email>[-\w\d.+]+@[-\w\d.]+) 
     ,\s
     (?P<phone>\d{3}-\d{3}-\d{4})''' #<-- Add closing paren. Remove extra backslashes. Remove trailing $
     , string, re.X | re.M) #<-- Add verbose (for multiline quote), and multiline (for multiple lines in string) flags
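
For anyone who wants to try it, here is a minimal, self-contained sketch that runs the corrected pattern against the challenge string and reads the two named groups. The print calls are only there for illustration; the challenge itself just needs the contacts variable.

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

contacts = re.search(r'''
    (?P<email>[-\w\d.+]+@[-\w\d.]+)  # email: name, @, domain
    ,\s                              # separator: matched, but outside any group
    (?P<phone>\d{3}-\d{3}-\d{4})     # phone: three digits, three digits, four digits
''', string, re.X | re.M)

# re.search stops at the first match, so this is Kenneth's row.
print(contacts.group('email'))  # kenneth+challenge@teamtreehouse.com
print(contacts.group('phone'))  # 555-555-5555
```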

May 11, 2015 3:36pm

Thank you, Chris. When do I get to use the caret (^)?

May 11, 2015 3:46pm

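The caret anchors the pattern to the beginning of the string, or to the beginning of each line when the re.M flag is set. The current pattern could also have been written with the caret, followed by something that matches anything from the beginning of the line up to the email: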

contacts = re.search(r''' #<-- Wrap in multiline quote
    ^.*?  #<-- Match beginning of line followed by anything. The ? keeps .* lazy so it does not swallow the start of the email
    (?P<email>[-\w\d.+]+@[-\w\d.]+)
     ,\s
     (?P<phone>\d{3}-\d{3}-\d{4})''' #<-- Closing paren added. Extra backslashes and trailing $ removed
     , string, re.X | re.M) #<-- Add verbose (for multiline quote), and multiline (for multiple lines in string) flags
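
A note on putting ^.* in front of a capturing group: .* is greedy by default, so it takes as much of the line as it can and leaves the email group only the minimum it needs to match. The sketch below compares greedy .* with lazy .*? on the first line of the challenge string. The names line and tail are mine, not from the challenge.

```python
import re

line = 'Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove'

# Shared part of the pattern: the two named groups and the separator.
tail = r'(?P<email>[-\w.+]+@[-\w.]+),\s(?P<phone>\d{3}-\d{3}-\d{4})'

greedy = re.search(r'^.*' + tail, line)   # .* grabs as much as possible
lazy = re.search(r'^.*?' + tail, line)    # .*? grabs as little as possible

print(greedy.group('email'))  # e@teamtreehouse.com
print(lazy.group('email'))    # kenneth+challenge@teamtreehouse.com
print(greedy.group('phone') == lazy.group('phone'))  # True
```

Both versions find the phone number, but only the lazy one leaves the whole email address for the group.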

July 8, 2015 3:39pm

Could you clarify when ^ and $ are needed and when they are not? I am unable to follow the post below.

July 8, 2015 4:38pm

When developing a regular expression, you are defining a pattern that is compared to a string, or a portion of the string, to find a match. In a very simple sense, think of aligning the pattern at each character of the string in turn and checking for a match. The pattern is shifted one character at a time, rechecking for a match in each case.
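
As a rough illustration of that sliding behaviour, re.search tries the pattern at each position until it fits, while re.match only tries position 0. The sample text and variable names below are just for the demo.

```python
import re

text = 'Kesten, Joy, joy@teamtreehouse.com, 555-555-5558'
phone = r'\d{3}-\d{3}-\d{4}'

# search slides along the string and reports where the pattern finally fit.
found = re.search(phone, text)
print(found.group())      # 555-555-5558
print(found.start() > 0)  # True: the match begins partway through the string

# match only tries the very first position, where the text is "Kesten...".
print(re.match(phone, text))  # None
```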

The special characters of caret (^) and dollar sign ($) represent specific alignment instructions for the pattern.

The caret indicates the pattern comparison must start at the first character of the searched string (or, with the re.M flag, at the beginning of any line in a multi-line string). The dollar sign indicates the pattern comparison must end at the last character of the searched string (or, with re.M, at the end of any line). These special characters can be thought of as anchors that tie the pattern to a specific position.
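
A small sketch of the difference the re.M flag makes to both anchors, using a made-up two-line string:

```python
import re

text = 'one two\nthree four'

# Without re.M, ^ and $ only anchor to the start and end of the whole string.
start_plain = re.findall(r'^\w+', text)
end_plain = re.findall(r'\w+$', text)

# With re.M, they also anchor to the start and end of every line.
start_multi = re.findall(r'^\w+', text, re.M)
end_multi = re.findall(r'\w+$', text, re.M)

print(start_plain)  # ['one']
print(start_multi)  # ['one', 'three']
print(end_plain)    # ['four']
print(end_multi)    # ['two', 'four']
```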

There are many reasons to choose to use these anchoring characters:

If your search pattern could match in multiple places within a string, an anchor can pin it to the occurrence at the start or at the end.
Using both caret and $ you can force your pattern to consume the entire searched string.
Using an anchor can improve the performance of your regex by reducing the amount of backtracking done during pattern matching.
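
To make the second point concrete, here is a sketch where ^ and $ together force the pattern to cover the whole string. The name phone_only is mine. One detail worth knowing: $ also matches just before a single trailing newline, so re.fullmatch is stricter if that matters.

```python
import re

phone_only = re.compile(r'^\d{3}-\d{3}-\d{4}$')

exact = bool(phone_only.search('555-555-5555'))
embedded = bool(phone_only.search('call 555-555-5555 now'))
trailing_newline = bool(phone_only.search('555-555-5555\n'))
strict = bool(re.fullmatch(r'\d{3}-\d{3}-\d{4}', '555-555-5555\n'))

print(exact)             # True
print(embedded)          # False: extra text before and after
print(trailing_newline)  # True: $ tolerates one newline at the very end
print(strict)            # False: fullmatch requires the entire string
```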

I'm sure there are others I've left off.

The caret and dollar sign are only needed if your regex pattern must include the start or end of the searched string, or of a line in a multi-line string.

April 12, 2018 9:32am

I am confused about why there is a "-" inside both [] sets in the email pattern. It's necessary for email, right? But I still don't know why.

April 12, 2018 5:54pm

The hyphen character is valid in usernames and in domain names, so it needs to be included. In regex, a hyphen is also used to specify a range of characters, like 0-9. To include a literal hyphen as a matching character, list it first (or last, or escape it with a backslash) in the character set so it is not read as part of a range.
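
A quick sketch of both roles of the hyphen. The address joy@tree-house.com is made up for the demo and is not in the challenge string.

```python
import re

# Between two characters, the hyphen defines a range: a, b, or c.
as_range = re.findall(r'[a-c]', 'a-c b')
# Listed first, the hyphen is a literal: "-", a, or c.
as_literal = re.findall(r'[-ac]', 'a-c b')

print(as_range)    # ['a', 'c', 'b']
print(as_literal)  # ['a', '-', 'c']

address = 'joy@tree-house.com'  # hypothetical address with a hyphenated domain
without_hyphen = re.search(r'[\w.+]+@[\w.]+', address).group()
with_hyphen = re.search(r'[-\w.+]+@[-\w.]+', address).group()

print(without_hyphen)  # joy@tree
print(with_hyphen)     # joy@tree-house.com
```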

Keep in mind the challenge is using simplified domain names. The full regex check for a valid domain name is very complicated.

Packages such as Django have built-in URL validators.

April 1, 2020 9:15am

Hi Chris,

I had the same solution as this except for phone number I had: ([\d]{3}-[\d]{3}-[\d]{4})

how come this didn't work?

April 1, 2020 3:25pm

Timothy Tseng, I’ve added the character set notation and it also works. Perhaps there is some other error. Could you include your full code?

Did you include the group name ?P<phone>?
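
A small sketch of why the brackets alone should not matter: [\d] and \d match the same characters. What does change things is leaving out the group name, since the match then has no named groups at all. I am assuming here that the challenge checker looks the groups up by name; that is a guess about the checker, not something stated in the challenge.

```python
import re

line = 'Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'

plain = re.search(r'(?P<phone>\d{3}-\d{3}-\d{4})', line)
bracketed = re.search(r'(?P<phone>[\d]{3}-[\d]{3}-[\d]{4})', line)
unnamed = re.search(r'([\d]{3}-[\d]{3}-[\d]{4})', line)

print(plain.group('phone') == bracketed.group('phone'))  # True
print(unnamed.group(1))     # 555-555-5558
print(unnamed.groupdict())  # {} because the group has no name
```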

October 25, 2016 10:40am

This is my way for this challenge, it passed:

contacts = re.search(r'''
    ^[\w]+,\s[\w]+,\s
    (?P<email>[\w+.]*@[\w.]*)
    ,\s
    (?P<phone>\d{3}-\d{3}-\d{4})
    ,\s
    @[\w]+$
''', string, re.X|re.I|re.M)

twitters = re.search(r'''
    @[\w]+$
''', string, re.X|re.M)

Is there any way to improve this?
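
One possible tidy-up, offered as a sketch rather than the expected answer: [\w] is the same as \w, re.I has no effect here because \w already covers both cases, and using + instead of * in the email group rejects empty name or domain parts. The names original and simplified are mine.

```python
import re

string = '''Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove
Chalkley, Andrew, andrew@teamtreehouse.co.uk, 555-555-5556, @chalkers
McFarland, Dave, dave.mcfarland@teamtreehouse.com, 555-555-5557, @davemcfarland
Kesten, Joy, joy@teamtreehouse.com, 555-555-5558, @joykesten'''

# The pattern as posted above.
original = re.search(r'''
    ^[\w]+,\s[\w]+,\s
    (?P<email>[\w+.]*@[\w.]*)
    ,\s
    (?P<phone>\d{3}-\d{3}-\d{4})
    ,\s
    @[\w]+$
''', string, re.X | re.I | re.M)

# Same idea with the redundant brackets and the re.I flag removed.
simplified = re.search(r'''
    ^\w+,\s\w+,\s                 # last name, first name
    (?P<email>[\w+.]+@[\w.]+)     # + requires at least one character on each side
    ,\s
    (?P<phone>\d{3}-\d{3}-\d{4})
    ,\s
    @\w+$                         # Twitter handle at the end of the line
''', string, re.X | re.M)

print(original.groupdict() == simplified.groupdict())  # True
```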

May 11, 2015 3:49pm

Got it !

August 25, 2018 12:17am

contacts = re.search(r'''
    (?P<email>[.+\w]+@[.\w]+) #email
    ,\s
    (?P<phone>\d{3}-\d{3}-\d{4}) #phone
''', string, re.X|re.M)

Here is my answer. My question is why do I need to add the ",\s" if I do not want to capture it?

August 27, 2018 2:06am

The pattern only matches text that fits it exactly. In the string, the email and phone number are separated by a comma and a space, so the “,\s” must be in the pattern for the match to reach the phone number. Because it sits outside the groups, it is matched but not captured.
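
A short sketch of the difference between what is matched and what is captured, using the first line of the challenge string. The names m and no_sep are only for the demo.

```python
import re

line = 'Love, Kenneth, kenneth+challenge@teamtreehouse.com, 555-555-5555, @kennethlove'

m = re.search(r'(?P<email>[.+\w]+@[.\w]+),\s(?P<phone>\d{3}-\d{3}-\d{4})', line)

# group(0) is everything the pattern matched, separator included.
print(m.group(0))  # kenneth+challenge@teamtreehouse.com, 555-555-5555
# groups() holds only what was inside the parentheses.
print(m.groups())  # ('kenneth+challenge@teamtreehouse.com', '555-555-5555')

# Without the separator, the phone group would have to start right after the email.
no_sep = re.search(r'(?P<email>[.+\w]+@[.\w]+)(?P<phone>\d{3}-\d{3}-\d{4})', line)
print(no_sep)  # None
```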

June 10, 2019 9:45am

contacts = re.search(r'''
    ^[\w]+,\s[\w]+,\s
    (?P<email>[\w+.]*@[\w.]*)
    ,\s
    (?P<phone>\d{3}-\d{3}-\d{4})
    ,\s
    @[\w]+$
''', string, re.X|re.I|re.M)

twitters = re.search(r''' @[\w]+$ ''', string, re.X|re.M)

November 11, 2021 7:40pm

import re

contacts = re.search(r''' (?P<email>[.+\w]+@[.\w]+),\s (?P<phone>\d{3}-\d{3}-\d{4}) #phone ''', string, re.X|re.M)

twitters = re.search(r''' @[\w]+$ ''', string, re.X|re.M)

Authors, in the order the posts appear above:

Wolverine .py (original question)
Chris Freeman
Wolverine .py
Chris Freeman
Hunter Kiely
Chris Freeman
Nate Yu
Chris Freeman
Timothy Tseng
Chris Freeman
Bright Zhao
Wolverine .py
Christopher Gunawan
Chris Freeman
Tapiwanashe Taurayi
Kalkidan Abebe