Bytes | Developer Community

How can I scrape data from different pages and add it to my current data (in Excel)?

import requests
from bs4 import BeautifulSoup
import pandas as pd

page = requests.get('https://www.arabam.com/ikinci-el/motosiklet/honda?view=List&take=50')
soup = BeautifulSoup(page.text, 'lxml')
rows = []
for tr in soup.select('tr'):
    # Keep non-empty cells and drop '-' placeholders
    row = [td.text.strip() for td in tr.select('td') if td.text.strip() and td.text.strip() != '-']
    if len(row) > 7:  # row[7] needs at least 8 cells
        row[7] = row[7][:-98]  # cut the trailing 98 characters from the eighth cell
        rows.append(row)

# Skip the first collected row
kolon = rows[1:]

# Write the rows to test.xlsx; the context manager saves the file on exit
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df = pd.DataFrame(kolon)
    df.to_excel(writer, index=False)

# With this code we scraped 51 rows of car information from the first page into test.xlsx, but we need 500 rows. The rest of the data is on the following pages, and we don't know how to add it to the existing test.xlsx file. How can we add the data from the other pages to the data we already have?

Nov 23 '20 #1
  • viewed: 2427
2 Replies

Nov 23 '20 #2
Just downloading the first page and parsing it with BeautifulSoup isn't going to cut it. You also need the 'brains' to run the JavaScript or other scripts that provide extra functionality on the page. For that, you need a framework like Selenium to drive a real browser.
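A minimal sketch of what this could look like, assuming Selenium 4 with a local Chrome install. The helper names `render_page` and `parse_rows` are made up for illustration, and whether this particular site really needs a browser is not confirmed here. The row parsing only mirrors the filter from the question.

```python
from bs4 import BeautifulSoup


def render_page(url):
    """Load url in headless Chrome and return the HTML after scripts have run."""
    # Imported here so parse_rows can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


def parse_rows(html):
    """Return one list of cell texts per <tr>, skipping empty and '-' cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("tr"):
        cells = [td.text.strip() for td in tr.select("td")]
        cells = [c for c in cells if c and c != "-"]
        if cells:  # header rows made only of <th> end up empty
            rows.append(cells)
    return rows
```

Something like `rows = parse_rows(render_page(url))` would then stand in for the `requests.get` call in the question.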
Nov 26 '20 #3
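
On the original question of getting the later pages into the same file: one common approach is to collect the rows from every page first and write the workbook once, instead of appending to test.xlsx page by page. The sketch below assumes the site accepts a `page` query parameter for pagination, which is a guess and should be checked against the site's real "next page" links. `page_urls` and `combine_pages` are hypothetical helper names, and the sample rows are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

import pandas as pd


def page_urls(base_url, n_pages, param="page"):
    """Build one URL per page by setting an ASSUMED pagination parameter."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    urls = []
    for number in range(1, n_pages + 1):
        query[param] = str(number)
        urls.append(urlunsplit(parts._replace(query=urlencode(query))))
    return urls


def combine_pages(pages):
    """Stack the row lists scraped from each page into a single DataFrame."""
    frames = [pd.DataFrame(rows) for rows in pages]
    return pd.concat(frames, ignore_index=True)


# Usage with rows already scraped from two pages (placeholder values).
pages = [[["a", "1"], ["b", "2"]], [["c", "3"]]]
df = combine_pages(pages)
# df.to_excel("test.xlsx", index=False)  # one write for all pages
```

With `take=50` in the URL, ten pages would give roughly the 500 rows asked for, if the pagination parameter works as assumed.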
