Local Web Pages: Saving Web Pages To Your Computer


Commands for downloading individual web pages from specific sites with wget and saving them on your computer, in a separate folder for each site.

# download one page from habr.com, specify the site domain as the folder

# -H allows wget to fetch page requisites (images, css, js, etc.) from other hosts; -D habr.com,fonts.gstatic.com,assets.habr.com,habrastorage.org restricts that to these domains and their subdomains

# -b (optional, not used in the commands below): run the download in the background

# download article page https://habr.com/ru/articles/581212/

sudo wget -P ~/Documents/vovchuks.org/other/local_internet_sites/habr.com -E -H -k -K -p -nc --save-cookies cookies.txt --referer="https://www.google.com/" --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0" --no-check-certificate -D habr.com,fonts.gstatic.com,assets.habr.com,habrastorage.org https://habr.com/ru/articles/581212/
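The domain list passed to -D has to be found somewhere. One possible way, sketched below, is to list the hosts that a saved HTML file still references through absolute URLs. The helper name `list_hosts` and the file name `page.html` are illustrative assumptions, and the pattern is a rough match rather than a full HTML parser.

```shell
#!/usr/bin/env bash
# Sketch: print the unique hosts referenced by absolute http(s) URLs in a file.
# list_hosts is a hypothetical helper name, not part of wget.
list_hosts() {
    # Match "http://" or "https://" followed by everything up to the first
    # slash, quote, space, angle bracket or closing parenthesis.
    grep -oE 'https?://[^/"'"'"' <>)]+' "$1" \
        | sed -E 's#^https?://##' \
        | LC_ALL=C sort -u
}

# Example (page.html is a placeholder for a saved page):
# list_hosts page.html
```

Hosts that show up here and serve images, styles or scripts are candidates for the -D list.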

---------8<---------

# download comments page https://habr.com/ru/articles/581212/comments/

sudo wget -P ~/Documents/vovchuks.org/other/local_internet_sites/habr.com -E -H -k -K -p -nc --save-cookies cookies.txt --referer="https://www.google.com/" --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0" --no-check-certificate -D habr.com,fonts.gstatic.com,assets.habr.com,habrastorage.org https://habr.com/ru/articles/581212/comments/

---------8<---------

# download one page from en.wikiquote.org, specify the site domain as the folder

# -np, do not ascend to the parent directory; -e robots=off, ignore robots.txt

sudo wget -P ~/Documents/vovchuks.org/other/local_internet_sites/en.wikiquote.org -np -E -H -k -K -p -nc --save-cookies cookies.txt --referer="https://www.google.com/" --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0" --no-check-certificate -e robots=off https://en.wikiquote.org/wiki/George_Orwell

---------8<---------

# download one page from en.wikipedia.org, specify the site domain as the folder

sudo wget -P ~/Documents/vovchuks.org/other/local_internet_sites/en.wikipedia.org -np -E -H -k -K -p -nc --save-cookies cookies.txt --referer="https://www.google.com/" --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0" --no-check-certificate -e robots=off https://en.wikipedia.org/wiki/George_Orwell
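The commands above differ only in the URL, the target folder and a few extra options, so they could be wrapped in a small helper. The following is a minimal sketch, not the method used above: `url_host`, `save_page` and `BASE_DIR` are names made up for illustration, and wget is called without sudo, so the files are created as the calling user.

```shell
#!/usr/bin/env bash
# Sketch: derive the per-site folder from the URL and call wget with the
# same options as the commands above. All names here are assumptions.

BASE_DIR="${BASE_DIR:-$HOME/Documents/vovchuks.org/other/local_internet_sites}"

# Print the host part of a URL, e.g. https://habr.com/ru/ -> habr.com
# (URLs containing user:password@ are not handled).
url_host() {
    local rest="${1#*://}"        # drop the scheme
    rest="${rest%%/*}"            # drop the path
    printf '%s\n' "${rest%%:*}"   # drop an optional port
}

# save_page URL [extra wget options...]
save_page() {
    local url="$1"
    shift
    wget -P "$BASE_DIR/$(url_host "$url")" \
        -E -H -k -K -p -nc \
        --save-cookies cookies.txt \
        --referer="https://www.google.com/" \
        --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0" \
        --no-check-certificate \
        "$@" "$url"
}

# Examples:
# save_page https://habr.com/ru/articles/581212/ -D habr.com,fonts.gstatic.com,assets.habr.com,habrastorage.org
# save_page https://en.wikipedia.org/wiki/George_Orwell -np -e robots=off
```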

---------8<---------

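# after downloading: the files were created by root (sudo wget), so return ownership to the user and reset directory and file permissions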

sudo chown -R mika:mika ~/Documents/vovchuks.org/other/local_internet_sites/

sudo find ~/Documents/vovchuks.org/other/local_internet_sites/ -type d -exec chmod 0755 {} \;

sudo find ~/Documents/vovchuks.org/other/local_internet_sites/ -type f -exec chmod 0644 {} \;
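The two find commands can also be written with `{} +` instead of `{} \;`, which passes many paths to each chmod call rather than starting one chmod per file. A minimal sketch, assuming the chown step has already run so that sudo is no longer needed; `normalize_perms` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: set directories to 0755 and regular files to 0644 below a directory.
# normalize_perms is a hypothetical helper name.
normalize_perms() {
    local dir="$1"
    # "{} +" batches paths, so chmod runs far fewer times than with "{} \;"
    find "$dir" -type d -exec chmod 0755 {} +
    find "$dir" -type f -exec chmod 0644 {} +
}

# Example:
# normalize_perms ~/Documents/vovchuks.org/other/local_internet_sites/
```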

