Extract filename and path from URL in bash script


Question

In my bash script I need to extract just the path from a given URL. For example, from a variable containing the string:

http://login:password@example.com/one/more/dir/file.exe?a=sth&b=sth

I want to extract to some other variable only the:

/one/more/dir/file.exe

part. Of course, the login, password, filename and parameters are optional.

I am new to sed and awk, so I would appreciate some help. Please advise me on how to do this. Thank you!

8/29/2009 9:05:33 AM

Accepted Answer

In bash:

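URL='http://login:password@example.com/one/more/dir/file.exe?a=sth&b=sth'
URL_NOPRO=${URL:7}        # drop the 7-character "http://" prefix
URL_REL=${URL_NOPRO#*/}   # drop everything up to and including the first "/"
echo "/${URL_REL%%\?*}"   # drop the query string; prints /one/more/dir/file.exe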

This works only if the URL starts with http:// or with a protocol prefix of the same length (7 characters). Otherwise, it is probably easier to use a regex with sed or grep, or to split the string with cut.
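
The fixed-length limitation can be avoided by stripping the scheme with a pattern instead of a character count. The following is a minimal sketch, not part of the original answer: the function name url_path is hypothetical, and it assumes the login and password contain no "/", "?" or "#" characters. It also treats a URL with no path as "/".

```bash
#!/usr/bin/env bash
# url_path is a hypothetical helper name used only for this illustration.
url_path() {
    local url=$1
    local rest=${url#*://}   # drop the scheme ("http://", "https://", "ftp://", ...)
    rest=${rest%%\#*}        # drop a "#fragment", if present
    rest=${rest%%\?*}        # drop a "?query", if present
    case $rest in
        */*) printf '/%s\n' "${rest#*/}" ;;  # keep what follows the first "/"
        *)   printf '/\n' ;;                 # host only: report the path as "/"
    esac
}

url_path 'http://login:password@example.com/one/more/dir/file.exe?a=sth&b=sth'
url_path 'https://example.com/index.html'
url_path 'ftp://example.com'
```

To store the result in another variable, as the question asks, capture the output with command substitution, for example URL_PATH=$(url_path "$URL").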

1/29/2015 5:30:11 AM

Bash has built-in parameter expansion operators that can handle this, namely the string pattern-matching operators, used as ${VAR#pattern}, ${VAR%pattern} and so on:

  1. '#' removes the shortest matching prefix
  2. '##' removes the longest matching prefix
  3. '%' removes the shortest matching suffix
  4. '%%' removes the longest matching suffix

For example:

FILE=/home/user/src/prog.c
echo "${FILE#/*/}"  # ==> user/src/prog.c
echo "${FILE##/*/}" # ==> prog.c
echo "${FILE%/*}"   # ==> /home/user/src
echo "${FILE%%/*}"  # ==> (empty string)
echo "${FILE%.c}"   # ==> /home/user/src/prog

All of this comes from the excellent book "A Practical Guide to Linux Commands, Editors, and Shell Programming" by Mark G. Sobell (http://www.sobell.com/).
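
Applied to the path from the question, the same operators can also split out the directory and the filename. This is a small sketch rather than something from the book or the original answers, and the variable names are illustrative only. Note that for a filename without a dot, the ext expansion below returns the whole filename.

```bash
#!/usr/bin/env bash
path='/one/more/dir/file.exe'

file=${path##*/}   # remove the longest prefix ending in "/"      -> file.exe
dir=${path%/*}     # remove the shortest suffix starting with "/" -> /one/more/dir
ext=${file##*.}    # remove the longest prefix ending in "."      -> exe
base=${file%.*}    # remove the shortest suffix starting with "." -> file

printf '%s\n' "$dir" "$file" "$base" "$ext"
```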


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow