
Web Development Comprehensive Course | Web Scraping (Crawling) | DB | MongoDB

Dhey 2021. 12. 26. 22:47
Crawling

 

·  A technique that fetches content from the Web as-is and extracts only the data you need

 

# Basic crawling setup

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get('enter URL here',headers=headers)

soup = BeautifulSoup(data.text, 'html.parser')

 

 

# How to use BeautifulSoup

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get('https://movie.naver.com/movie/sdb/rank/rmovie.naver?sel=pnt&date=20210829',headers=headers)

soup = BeautifulSoup(data.text, 'html.parser')

title = soup.select_one('#old_content > table > tbody > tr:nth-child(2) > td.title > div > a')
print(title)

 

 

First, let's learn how to use select and select_one, the selection methods built into BeautifulSoup.

# How to use selectors (as copied with Copy selector)
soup.select('tag')
soup.select('.class')
soup.select('#id')

soup.select('parent > child > child')
soup.select('parent.class > child.class')

# Find by tag and attribute value
soup.select('tag[attribute="value"]')

# When you want only one result
soup.select_one('same selectors as above')
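
The selector patterns above can be tried without any network request. The following is a minimal sketch, assuming a small made-up HTML snippet (the ids, classes, and links are hypothetical and not taken from the Naver page). It shows that select returns a list of every match, while select_one returns only the first match, or None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration (not the real Naver page).
html = """
<div id="box">
  <p class="note">first</p>
  <p class="note">second</p>
  <a href="/one" target="_blank">link</a>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

notes = soup.select('#box > p.note')          # list of every matching tag
first = soup.select_one('#box > p.note')      # first matching tag only
link = soup.select_one('a[target="_blank"]')  # tag name + attribute value
missing = soup.select_one('span.none')        # selector that matches nothing

print(len(notes))   # 2
print(first.text)   # first
print(link.text)    # link
print(missing)      # None
```

Because select_one can return None, it is worth checking the result before reading .text or an attribute from it.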

 

Running the code above prints the entire <a> tag. To print only the hyperlink, that is, the value of the href attribute between the " ", index the tag by attribute name:

print(title['href'])

 

In addition:

To print the text inside a tag → tag.text

To print an attribute of a tag → tag['attribute']
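
As a small illustration of the two access styles, here is a sketch that assumes a made-up one-line <a> tag (the href and title values are hypothetical). It also shows tag.get(), which returns None for a missing attribute where the square-bracket form would raise a KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical one-line HTML for illustration.
soup = BeautifulSoup('<a href="/movie/1" title="Sample">  Sample Movie </a>', 'html.parser')
tag = soup.select_one('a')

text = tag.text.strip()   # text inside the tag, surrounding whitespace trimmed
href = tag['href']        # attribute value; raises KeyError if the attribute is absent
alt = tag.get('alt')      # None here, because this tag has no alt attribute

print(text, href, alt)
```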

 

 

Now let's crawl the Naver Movie page and collect the rank, title, and rating of each movie.

 

To get the selectors for the 1st- and 2nd-ranked movie titles, right-click the element you want, choose Inspect, then right-click the highlighted tag and choose Copy selector. The copied selectors look like this:

#old_content > table > tbody > tr:nth-child(2) > td.title > div > a     // rank 1
#old_content > table > tbody > tr:nth-child(3) > td.title > div > a     // rank 2

 

The two selectors are identical from #old_content up to tr, so use select to fetch all the tr rows.

movies = soup.select('#old_content > table > tbody > tr')

 

Next, copying the selectors for the rank, title, and rating gives:

#old_content > table > tbody > tr:nth-child(2) > td:nth-child(1) > img     // rank
#old_content > table > tbody > tr:nth-child(2) > td.title > div > a        // title
#old_content > table > tbody > tr:nth-child(2) > td.point      // rating

 

The tr rows are already loaded, so selecting only the part after tr for each value gives this code:

for movie in movies:
    a = movie.select_one('td.title > div > a')
    if a is not None:
        title = a.text
        rank = movie.select_one('td:nth-child(1) > img')
        star = movie.select_one('td.point')

 

However, with only this, rank and star hold the tags themselves rather than their values. The rank is stored in the alt attribute of the img tag, so read that attribute.

The title and rating are needed as text, so apply .text to them. The final code is as follows.

movies = soup.select('#old_content > table > tbody > tr')

for movie in movies:
    a = movie.select_one('td.title > div > a')
    if a is not None:
        title = a.text
        rank = movie.select_one('td:nth-child(1) > img')['alt']
        star = movie.select_one('td.point').text
        print(rank, title, star)
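
The same loop can be wrapped in a function and tried offline. The sketch below assumes a tiny hand-written table shaped like the ranking page. SAMPLE_HTML, parse_movies, and the movie data are hypothetical, and the real page may differ. Note that html.parser does not add a tbody element by itself, so the selector only matches when tbody is present in the HTML that was actually downloaded.

```python
from bs4 import BeautifulSoup

# Hypothetical sample shaped like the ranking table; the real page may differ.
SAMPLE_HTML = """
<div id="old_content"><table><tbody>
<tr><td colspan="3"></td></tr>
<tr>
  <td><img alt="01"></td>
  <td class="title"><div><a href="/m/1">Movie A</a></div></td>
  <td class="point">9.59</td>
</tr>
<tr>
  <td><img alt="02"></td>
  <td class="title"><div><a href="/m/2">Movie B</a></div></td>
  <td class="point">9.50</td>
</tr>
</tbody></table></div>
"""

def parse_movies(html):
    """Return a list of {'rank', 'title', 'star'} dicts, skipping incomplete rows."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for row in soup.select('#old_content > table > tbody > tr'):
        a = row.select_one('td.title > div > a')
        img = row.select_one('td:nth-child(1) > img')
        point = row.select_one('td.point')
        if a is None or img is None or point is None:
            continue  # separator row or row without a full movie entry
        results.append({
            'rank': img.get('alt', ''),
            'title': a.text.strip(),
            'star': point.text.strip(),
        })
    return results

parsed = parse_movies(SAMPLE_HTML)
for m in parsed:
    print(m['rank'], m['title'], m['star'])
```

Checking all three tags for None, rather than only the title link, keeps the loop from failing on a row that happens to lack an image or a rating cell.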

 

 


 

MongoDB

 

· One of the most widely used NoSQL database systems

 

Quick aside!

- Why use a DB (database)?

  : To store data well? X , To retrieve and use data easily O

 

# MongoDB import code

from pymongo import MongoClient
client = MongoClient('mongodb+srv://<username>:<password>@cluster0.7n9jh.mongodb.net/Cluster0?retryWrites=true&w=majority')
db = client.dbsparta
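
A connection string contains a password, so it is usually safer to keep it out of the source file. The following is a minimal sketch, not part of the course code: build_mongo_uri and the environment variable names MONGO_USER, MONGO_PASSWORD, and MONGO_HOST are hypothetical. It assembles the URI from environment variables and percent-escapes the user name and password with quote_plus so that characters such as @ or / do not break the URI.

```python
import os
from urllib.parse import quote_plus

def build_mongo_uri(user, password, host, options='retryWrites=true&w=majority'):
    """Build a mongodb+srv URI with the user name and password percent-escaped."""
    return f'mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}/?{options}'

# MONGO_USER, MONGO_PASSWORD and MONGO_HOST are hypothetical variable names;
# the fallback values are placeholders, not real credentials.
uri = build_mongo_uri(
    os.environ.get('MONGO_USER', 'user'),
    os.environ.get('MONGO_PASSWORD', 'p@ss/word'),
    os.environ.get('MONGO_HOST', 'cluster0.example.mongodb.net'),
)

# The resulting string would then be passed to MongoClient(uri).
```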

 

 

# pymongo code summary

# Insert - example
doc = {'name':'bobby','age':21}
db.users.insert_one(doc)

# Find one - example
user = db.users.find_one({'name':'bobby'})

# Find many - example (output excludes the _id value)
# A filter condition can go in the first argument; {} matches every document
all_users = list(db.users.find({},{'_id':False}))

# Update - example
db.users.update_one({'name':'bobby'},{'$set':{'age':19}})

# Delete - example
db.users.delete_one({'name':'bobby'})

 

 

Now let's save the results of the Naver Movie crawl above to the DB.

To save them, replace print(rank, title, star) with

doc = {
    'title' : title,
    'rank' : rank,
    'star' : star
}
db.movies.insert_one(doc)

After making this change and running the code, you can confirm that the data has been saved in the DB.

DB save result
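
The scraped rank and rating are strings. If you would rather store them as numbers, one possible approach is to convert each row before inserting. The sketch below is an assumption-laden illustration: to_doc and the sample rows are hypothetical, and it assumes the alt value and rating text are plain numeric strings. If you store numbers this way, later queries and updates on star must also use numbers instead of strings.

```python
def to_doc(rank, title, star):
    """Convert scraped strings into a document with a numeric rank and rating."""
    return {'title': title.strip(), 'rank': int(rank), 'star': float(star)}

# Hypothetical scraped rows in the (rank, title, star) order used in the loop.
rows = [('01', 'Movie A', '9.59'), ('02', 'Movie B', '9.50')]
docs = [to_doc(*row) for row in rows]
print(docs)

# With a live connection, pymongo's insert_many stores the whole list in one call:
# db.movies.insert_many(docs)
```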

 

 

Let's practice with a few simple pymongo exercises.

 

1. Get the rating of the movie '가버나움' (Capernaum)

movie = db.movies.find_one({'title':'가버나움'})
print(movie['star'])

 

2. Get the titles of the movies whose rating equals that of '가버나움'

movie = db.movies.find_one({'title':'가버나움'})
star = movie['star']

all_movies = list(db.movies.find({'star':star},{'_id':False}))
for m in all_movies:
    print(m['title'])

 

3. Set the rating of '가버나움' to 0

db.movies.update_one({'title': '가버나움'}, {'$set': {'star': '0'}})