BASH script: Downloading consecutive numbered files with wget


Question

I have a web server that saves the log files of a web application under numbered names. Example file names would be:

dbsclog01s001.log
dbsclog01s002.log
dbsclog01s003.log

The last 3 digits are the counter, and it can sometimes go up to 100.

I usually open a web browser, browse to the file like:

http://someaddress.com/logs/dbsclog01s001.log

and save the files. This of course gets a bit annoying when there are 50 logs. I tried to come up with a Bash script that uses wget and passes it

http://someaddress.com/logs/dbsclog01s*.log

but I am having problems with my script. Does anyone have a sample showing how to do this?

Thanks!

9/15/2009 11:10:19 AM

Accepted Answer

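#!/bin/sh

# Require a URL format plus the first and last sequence numbers.
if [ $# -lt 3 ]; then
        echo "Usage: $0 url_format seq_start seq_end [wget_args]" >&2
        exit 1
fi

url_format=$1
seq_start=$2
seq_end=$3
shift 3

# printf repeats the format once per number produced by seq, giving one URL
# per line; wget reads that list from stdin (-i -) along with any extra args.
# The $(seq ...) substitution is deliberately unquoted so it splits into words.
printf "$url_format\\n" $(seq "$seq_start" "$seq_end") | wget -i - "$@"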

Save the above as seq_wget, give it execute permission (chmod +x seq_wget), and then run it, for example:

$ ./seq_wget http://someaddress.com/logs/dbsclog01s%03d.log 1 50
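
The script above relies on printf reusing its format string until it runs out of arguments, so a single %03d format plus a list of numbers yields one zero-padded URL per line. A minimal sketch of just that step, with no download involved (the range 1 to 3 is chosen only for illustration):

```shell
# printf applies the format again for each remaining argument,
# so three numbers from seq produce three URLs, one per line.
urls=$(printf 'http://someaddress.com/logs/dbsclog01s%03d.log\n' $(seq 1 3))
echo "$urls"
```

Running this is a cheap way to check the URL list before piping it to wget -i -.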

Or, if you have Bash 4.0 or later (which zero-pads brace ranges), you could just type

$ wget http://someaddress.com/logs/dbsclog01s{001..050}.log
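
Because the braces are expanded by the shell before wget ever starts, the expansion can be previewed with echo. A small sketch, assuming a Bash 4.0+ binary is on the PATH; it is run through bash -c only so that it behaves the same when the calling shell is not Bash, and the range 001 to 003 is illustrative:

```shell
# echo prints the arguments wget would receive; nothing is downloaded.
# Single quotes hand the braces to the inner bash untouched.
preview=$(bash -c 'echo http://someaddress.com/logs/dbsclog01s{001..003}.log')
echo "$preview"
```

If the output shows 1, 2, 3 without padding, the Bash in use is older than 4.0 and the seq_wget script is the safer choice.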

Or, if you have curl instead of wget, you could follow Dennis Williamson's answer.

6/3/2018 8:27:08 AM

curl seems to support ranges. From the man page:

URL  
       The URL syntax is protocol dependent. You’ll find a detailed description
       in RFC 3986.

       You can specify multiple URLs or parts of URLs by writing part sets
       within braces as in:

        http://site.{one,two,three}.com

       or you can get sequences of alphanumeric series by using [] as in:

        ftp://ftp.numericals.com/file[1-100].txt
        ftp://ftp.numericals.com/file[001-100].txt    (with leading zeros)
        ftp://ftp.letters.com/file[a-z].txt

       No nesting of the sequences is supported at the moment, but you can use
       several ones next to each other:

        http://any.org/archive[1996-1999]/vol[1-4]/part{a,b,c}.html

       You can specify any amount of URLs on the command line. They will be
       fetched in a sequential manner in the specified order.

       Since curl 7.15.1 you can also specify step counter for the ranges, so
       that you can get every Nth number or letter:

        http://www.numericals.com/file[1-100:10].txt
        http://www.letters.com/file[a-z:2].txt

You may have noticed that it says "with leading zeros"!
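
To try the range syntax without touching a real server, the sketch below points curl at file:// URLs in a scratch directory. The temporary directory, the three sample files, and the out subdirectory are all hypothetical stand-ins created just for this demo.

```shell
# Create three numbered sample files to stand in for the server's logs.
tmp=$(mktemp -d)
for i in 001 002 003; do
    echo "log $i" > "$tmp/dbsclog01s$i.log"
done
mkdir "$tmp/out"

# [001-003] is expanded by curl itself, keeping the leading zeros.
# The quotes stop the shell from treating the brackets as a glob;
# -O saves each file under its remote name, -s hides the progress meter.
(cd "$tmp/out" && curl -s -O "file://$tmp/dbsclog01s[001-003].log")
ls "$tmp/out"
```

For the logs in the question, the same pattern would presumably be curl -O "http://someaddress.com/logs/dbsclog01s[001-050].log".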


Licensed under: CC-BY-SA with attribution