Question

(Python) Develop a crawler that collects the email addresses in the visited web pages. You can...

(Python) Develop a crawler that collects the email addresses in the visited web pages. You can write a function emails() that takes a document (as a string) as input and returns the set of email addresses (i.e., strings) appearing in it. You should use a regular expression to find the email addresses in the document. Explain the code and how it works.

0 0
Add a comment Improve this question Transcribed image text
Answer #1

Program: File name: Main.py #import necessary header files import re from urllib.request import urlopen wwwwo wwww m emails (

#Assign the webpage url = www http://www.cdm.depaul.edu #Assign the content into the variable document document = urlopen

Editable code:

#import necessary header files

import re

from urllib.request import urlopen

#Define the function "emails()"

def emails(document):

#Declare the list

l1 =[]

#For loop to print the list of email addresses

for email in re.findall(r'[\w\.-]+@[\w\.-]+', document):

#Check, email is not in list

if not email in l1:

#Then, add the email to the list "l1"

l1.append(email)

#Print the list "l1"

print(l1)

#Assign the webpage

url = 'http://www.cdm.depaul.edu'

#Assign the content into the variable "document"

document = urlopen(url).read().decode()

#Call the function "emails()"

emails(document)

Add a comment
Know the answer?
Add Answer to:
(Python) Develop a crawler that collects the email addresses in the visited web pages. You can...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Develop a crawler that collects the email addresses in the visited web pages. You can write...

    Develop a crawler that collects the email addresses in the visited web pages. You can write a function emails() that takes a document (as a string) as input and returns the set of email addresses (i.e., strings) appearing in it. You should use a regular expression to find the email addresses in the document. Explain the code and how it works.

  • Preliminaries For this lab you will be working with regular expressions in Python. Various functions for...

    Preliminaries For this lab you will be working with regular expressions in Python. Various functions for working with regular expressions are available in the re module. Fortunately, Python makes it pretty easy to check if a string matches a particular pattern. At the top of the file we must import the re module: import re Then we can use the search() function to test whether a string matches a pattern. In the example below, the regular expression has been saved...

  • Python! Input: For this first part you will be given a set of strings that represents a fully par...

    python! Input: For this first part you will be given a set of strings that represents a fully parenthesized infix expression that eventually (next assignment) be differentiated. The expression will be composed of single digit integers ("A"-"E"), the variable "X", parenthesis "(" and ")', and the binary operators +, -, *, /, and ^ (exponentiation). No spaces. Process: Generate a binary parse tree from the given input. Output: (1) Echo print the input string. (2) Print out the parse tree...

  • Hello, In Python, I am trying to create a function that will return the domain name...

    Hello, In Python, I am trying to create a function that will return the domain name from any inputted email address. For example, my input of [email protected] would return just the domainname. It needs to be done using regular expressions and the map function. I'm still stuck on trying to get the correct parameters for regular expression and also what is throwing it off. Thank you for your help! I am currently working with a list of different email addresses...

  • For Python, I am a little confused on the use of reduce,map, and lambda. My assignment wants me t...

    For Python, I am a little confused on the use of reduce,map, and lambda. My assignment wants me to not use .join() and write a SINGLE LINE function to join each letter in a list with another string.(See problem below). Using reduce, map, and lambda, write the single-expression function myJoin(L, sep) that takes a non-empty list L and a string sep, and without calling sep.join(L), works roughly the same as that returning a single string with the values in L...

  • Python: Suppose you develop a code that replaces a string with a new string that consists...

    Python: Suppose you develop a code that replaces a string with a new string that consists of all the even indexed characters of the original followed by all the odd indexed characters. For example, the string 'computers' would be encoded as 'cmuesoptr'. Write a function encode(word) that returns the encoded version of the string named word.

  • Python 3 coding with AWS/Ide. Objective: The purpose of this lab is for you to become familiar with Python’s built-in te...

    Python 3 coding with AWS/Ide. Objective: The purpose of this lab is for you to become familiar with Python’s built-in text container -- class str-- and lists containing multiple strings. One of the advantages of the str class is that, to the programmer, strings in your code may be treated in a manner that is similar to how numbers are treated. Just like ints, floats, a string object (i.e., variable) may be initialized with a literal value or the contents...

  • Can I see that source code in python Write the following value-returning function: encoded - Takes...

    Can I see that source code in python Write the following value-returning function: encoded - Takes a string as parameter and returns the string with each vowel substituted by its corresponding number as follows: a -1 e-2 0-4 。u_5 The function MUST USE string slicing Write a main program that will take an input string (text) from command line and it will display the input text encoded. You will call the encoded function to get the text encoded.

  • Python Coding The script, `eparser.py`, should: 1. Open the file specified as the first argument to...

    Python Coding The script, `eparser.py`, should: 1. Open the file specified as the first argument to the script (see below) 2. Read the file one line at a time (i.e., for line in file--each line will be a separate email address) 3. Print (to the screen) a line that "interprets" the email address as described below 4. When finished, display the total number of email addresses and a count unique domain names Each email address will either be in the...

  • Please solve in Python. You would like to set a password for an email account. However,...

    Please solve in Python. You would like to set a password for an email account. However, there are two restrictions on the format of the password. It has to contain at least one uppercase character and it cannot contain any digits. You are given a string S consisting of N alphanumerical characters. You would like to find the longest substring of Sthat is a valid password. A substring is defined as a contiguous segment of a string. For example, given...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT