From: Mitch Collinsworth
To: "Dan Larsson"
Cc: questions@FreeBSD.ORG
Subject: Re: regexp driving me nuts, help needed!
Date: Thu, 11 May 2000 13:27:15 -0400

>I need to get the domain and tld from an url.
>
>this my idea of what would match and return 'domain.com':
>echo http://www.domain.com/html.asp | sed -e 's/\([\.a-zA-Z0-9]+[a-zA-Z]{2,3}\
>)/\1 /g'
>
>But that's not what sh thinks ( it returns the whole url )
>What regexp should I use to get the desired result?

Here's a perl 1-liner:

  echo http://www.domain.com/html.asp |\
  perl -e '$u=<>; $u=~s/http:\/\///; $u=~s/^www\.//i; $u=~s/\/.*$//; print $u'
  domain.com

This works in stages, so it doesn't depend on the starting string
containing every syntactic element: a substitution that finds nothing
to match simply leaves the string alone.

-Mitch
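
P.S. For anyone who would rather stay in sed, the same staged idea can
be sketched as below. The original attempt most likely returned the
whole URL because sed uses basic regular expressions by default, where
an unescaped + and {2,3} are ordinary characters, so the pattern never
matched. Treat this as a minimal sketch under some assumptions: the
function name url_domain is made up for illustration, only an http://
scheme and a leading "www." are stripped, and any other subdomain is
left in place.

```shell
# url_domain is a hypothetical helper name, not part of any tool.
# Each -e expression is one stage; a stage that does not match is a no-op.
#   stage 1: drop a leading "http://"
#   stage 2: drop a leading "www." in any letter case
#   stage 3: drop everything from the first "/" onward
url_domain() {
    printf '%s\n' "$1" | sed \
        -e 's|^http://||' \
        -e 's|^[Ww][Ww][Ww]\.||' \
        -e 's|/.*$||'
}

url_domain http://www.domain.com/html.asp
url_domain http://domain.com
```

Both calls print domain.com. Using | as the delimiter avoids having to
escape the slashes in the URL, and the bracketed [Ww][Ww][Ww] stands in
for perl's /i flag so the sketch stays within portable sed syntax.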