How to use BeautifulSoup for Webscraping

0 votes

I am trying to scrape all the subject titles of all the forum posts on this website. I am not sure how to go about this as the HTML format of the forum website is not what I am familiar with.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'http://thailove.net/bbs/board.php?bo_table=ent'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

# I don't think this is correct, but I'm not sure how else to do this...
containers = page_soup.findAll("td",{"class":"td_subject"})


for container in containers:
    subject = container.a.font.font.contents
# similarly not sure this is correct
print("subject: ", subject)

Please let me know what I should do. Also keep in mind that the website is in Korean but can be easily translated into English if need be.

Sep 6, 2018 in Python by bug_seeker
• 15,510 points
2,177 views

1 answer to this question.

0 votes

Your code is fine up to the for loop. Inside the loop, access container.a.contents[0] to get each subject, and make sure the print call is inside the loop as well:

for container in containers:
    subject = container.a.contents[0]
    print("subject: ", subject)
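
A slightly more defensive variant is sketched below. It is only an illustration: the sample markup, the extract_subjects helper, and the /post/... links are hypothetical stand-ins, since the real page's HTML may differ. It uses get_text() rather than contents[0], because contents[0] returns only the first child of the link, which can be a nested tag (such as font) instead of the text itself.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the forum page; the real page may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="td_subject"><a href="/post/1">First post</a></td></tr>
  <tr><td class="td_subject"><a href="/post/2"><font>Second <font>post</font></font></a></td></tr>
  <tr><td class="td_subject">No link here</td></tr>
</table>
"""


def extract_subjects(html):
    """Return the link text of every td.td_subject cell that contains a link."""
    page_soup = BeautifulSoup(html, "html.parser")
    subjects = []
    for cell in page_soup.find_all("td", class_="td_subject"):
        link = cell.find("a")
        if link is None:
            # Skip cells without a link instead of raising AttributeError.
            continue
        # Collect all text inside the link, including text in nested tags.
        subjects.append(link.get_text(" ", strip=True))
    return subjects


for subject in extract_subjects(SAMPLE_HTML):
    print("subject:", subject)
```

To run this against the live page, pass the page_html value from the question to extract_subjects instead of SAMPLE_HTML.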
answered Sep 6, 2018 by Priyaj
• 58,020 points
