Say you’d like to find the unique IP addresses behind the lines in your Apache access.log (or any log file with a similar format, really) that contain “Googlebot”:
grep 'Googlebot' access.log | cut -d' ' -f1 | sort | uniq
which finds the lines via grep, uses cut to extract the first field (space delimited), sorts the IP addresses and then uniqifies them. The sort has to come first because uniq only collapses adjacent duplicate lines.
Dirt simple, stupidly powerful.
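If you want to try the pipeline without touching a real log, here is a minimal sketch. The sample log lines, the documentation-range IP addresses, and the variable names (log, unique_ips, counts) are all made up for illustration; the only assumption about your real log is that the client IP is the first space-delimited field.

```shell
#!/usr/bin/env bash
# Build a tiny, made-up access log so the pipeline can be tried safely.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.10 - - [10/Oct/2023:13:57:12 +0000] "GET /wiki HTTP/1.1" 200 901 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
203.0.113.5 - - [10/Oct/2023:13:58:40 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
EOF

# Same pipeline as above: match, take field 1, sort, drop duplicates.
unique_ips=$(grep 'Googlebot' "$log" | cut -d' ' -f1 | sort | uniq)
echo "$unique_ips"

# Variation: count matching requests per IP, busiest first.
counts=$(grep 'Googlebot' "$log" | cut -d' ' -f1 | sort | uniq -c | sort -rn)
echo "$counts"

# Clean up the temporary sample log.
rm -f "$log"
```

Note that sort -u does the same job as sort | uniq in one step, and uniq -c is what turns the list of IPs into a per-IP request count.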
Thanks for posting the article. Please state the exact location where we have to paste this code: “grep 'Googlebot' access.log | cut -d' ' -f1 | sort | uniq”
That’s something you’d run in a bash shell, where “access.log” is the path to your Apache access log.
Very useful. My home web server is suffering from DoS attacks by something trying to edit pages on my wiki when it doesn’t have permission. Using your tip I extracted all the IP addresses of the morons from my Apache log and added them to hosts.deny.
Glad to be of service. I’ve got another blog entry cooking that puts this in a bash for loop so you can inspect the actual traffic on an individual basis via nested greps. Very handy.
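In the meantime, here is a rough sketch of what such a loop might look like. It is only a guess at the idea, not the follow-up post itself; the sample log, the IP addresses, and the variable names (log, report) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample log; in practice, point "log" at your real access log.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.10 - - [10/Oct/2023:13:57:12 +0000] "GET /wiki HTTP/1.1" 200 901 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
203.0.113.5 - - [10/Oct/2023:13:58:40 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
EOF

# The outer pipeline picks the unique IPs; the inner greps pull each IP's traffic.
report=$(
  for ip in $(grep 'Googlebot' "$log" | cut -d' ' -f1 | sort -u); do
    echo "== $ip =="
    # -F treats the IP as a literal string, so the dots are not wildcards.
    # The trailing space stops 192.0.2.1 from also matching 192.0.2.10.
    # Simplification: the match is not anchored to the start of the line.
    grep -F "$ip " "$log" | grep 'Googlebot'
  done
)
echo "$report"

# Clean up the temporary sample log.
rm -f "$log"
```

Each IP gets its own header line followed by the matching requests, which makes it easier to eyeball what one client was actually doing.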
So glad to have found this page. When I was purposely seeking this information (using grep to extract uniq IPs), I couldn’t find it – but lo and behold, I found it while searching on another topic.
Thanks very much…