A web scraping application for a server-side rendered (SSR) luxury watch website, built with Python and Scrapy

511234/scrapy-practice

Scrapy Practice

Inspiration:

Scrapy Course – Python Web Scraping for Beginners

Website to scrape:

Finished: server-side rendered (SSR) website: BestWatch
WIP: data from an API-backed website: Strata

Screenshots

Job Overview on Zyte (Scrapy Cloud)

Data stored on Supabase

Sample data

Website: https://bestwatch.com.hk/longines-presence-l49214726.html
Screenshots: Sample Overview

Sample watch specification

# watch_spec:
{
  "band": {
    "band_color": "Silver",
    "band_material": "Stainless Steel"
  },
  "case": {
    "back": "Transparent",
    "bezel": "-",
    "glass": "-",
    "shape": "",
    "height": "-",
    "diameter": "38mm",
    "material": "Stainless Steel",
    "lug_width": "20mm"
  },
  "dial": {
    "hands": "Silver-tone",
    "finish": "Polished",
    "indexes": "Index",
    "dial_type": "-",
    "dial_color": "Silver"
  },
  "features": {
    "watch_features": "Calendar, Stainless Steel",
    "water_resistance": "30 m"
  },
  "movement": {
    "time": "-",
    "type": "Automatic",
    "jewels": "-",
    "caliber": "L619/888",
    "reserve": "42 hours",
    "diameter": "-",
    "frequency": "-",
    "additionals": "-",
    "chronograph": "-"
  },
  "information": {
    "brand": "Longines",
    "model": "L49214726",
    "gender": "M",
    "series": "Présence",
    "limited": "Yes",
    "produced": "-"
  }
}
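The sample above uses "-" and empty strings for fields the site does not specify. As a minimal sketch (not part of this repository), the nested spec could be normalized and flattened before storage or export. The helper names `clean_spec` and `flatten_spec` are hypothetical, and treating "-" and "" as missing values is an assumption based only on the sample.

```python
from typing import Any

# Assumption: these strings mean "not specified" in the scraped data.
PLACEHOLDERS = {"-", ""}


def clean_spec(spec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a nested spec with placeholder strings replaced by None."""
    cleaned: dict[str, Any] = {}
    for key, value in spec.items():
        if isinstance(value, dict):
            cleaned[key] = clean_spec(value)  # recurse into groups such as "case"
        elif isinstance(value, str) and value.strip() in PLACEHOLDERS:
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


def flatten_spec(spec: dict[str, Any], sep: str = ".") -> dict[str, Any]:
    """Flatten nested groups into dotted keys, e.g. "case.diameter"."""
    flat: dict[str, Any] = {}
    for group, fields in spec.items():
        if isinstance(fields, dict):
            for name, value in flatten_spec(fields, sep).items():
                flat[f"{group}{sep}{name}"] = value
        else:
            flat[group] = fields
    return flat


# A trimmed-down version of the sample watch_spec shown above.
sample = {
    "case": {"bezel": "-", "shape": "", "diameter": "38mm"},
    "movement": {"type": "Automatic", "jewels": "-"},
}
flat = flatten_spec(clean_spec(sample))
```

Flat, dotted keys map directly onto CSV columns, while `None` values can be stored as SQL NULL instead of a literal dash.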

Installation

python3.11 -m venv venv
source venv/bin/activate
pip3 install scrapy ipython psycopg2-binary shub

Configuration

# In scrapy.cfg, under the [settings] section:
[settings]
shell = ipython

How to Use

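# The spider name must match the `name` attribute of the spider class;
# run `scrapy list` to see the available names.

# Output to the console (and to the database if a pipeline is configured)
scrapy crawl bestwatchspider

# Output to CSV (-O overwrites an existing file)
scrapy crawl bestwatch -O bestwatch.csv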
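The first command above writes to a database when an item pipeline is configured. The repository's pipeline is not shown here, so the following is only a sketch of one step such a pipeline might perform: turning a scraped item into a parameterized INSERT for psycopg2. The function `build_insert`, the table name `watches`, and the column names `url` and `watch_spec` are all hypothetical.

```python
import json
from typing import Any


def build_insert(item: dict[str, Any], table: str = "watches") -> tuple[str, list[Any]]:
    """Return (sql, params) for a parameterized INSERT.

    Nested dicts and lists are serialized to JSON text so they can be
    stored in a json/jsonb or text column. The table and column names are
    interpolated into the SQL string, so they must come from trusted code,
    never from scraped data.
    """
    columns = sorted(item)  # stable column order
    params = [
        json.dumps(item[col], ensure_ascii=False)
        if isinstance(item[col], (dict, list))
        else item[col]
        for col in columns
    ]
    placeholders = ", ".join(["%s"] * len(columns))  # psycopg2 placeholder style
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, params


# Hypothetical item shape; only the URL and brand come from the sample data.
item = {
    "url": "https://bestwatch.com.hk/longines-presence-l49214726.html",
    "watch_spec": {"information": {"brand": "Longines"}},
}
sql, params = build_insert(item)
```

With an open psycopg2 connection, the result would be passed to `cursor.execute(sql, params)`, leaving value quoting to the driver.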

Deployment at Zyte

$ pip install shub
$ shub login
$ shub deploy {id}

Practice area

scrapy shell

# Server side
fetch('http://strata.ca/')
response
response.css('div.listingTile')
response.css('div.listingTile').get()
listings = response.css('div.listingTile')
