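Command: grep -oE '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' sample.txt
This command extracts strings that look like email addresses from sample.txt. The -o option prints only the matched text, and -E enables extended regular expressions.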
Detail:
\b : Matches a word boundary, so the match starts and ends at the edge of a word.
[A-Za-z0-9._%+-]+ : Defines the username of the e-mail address. Uppercase letters, lowercase letters, digits, and five special characters (period, underscore, percent sign, plus sign, and hyphen) are allowed here, and at least one character is required.
@ : Represents the "@" symbol of the e-mail address.
[A-Za-z0-9.-]+ : Defines the domain part of the e-mail address. Here again, uppercase letters, lowercase letters, numbers, and certain special characters (dots and dashes) are allowed and at least one character is required.
\. : Matches a literal dot. The backslash is needed because an unescaped dot matches any character.
[A-Za-z]{2,} : Represents the top-level domain of the email address. This section must contain at least two letters.
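The pattern above can be tried on a small input to see what it does and does not match. This is a minimal sketch: the temporary file and its two lines are made-up sample data, not part of the original sample.txt.

```shell
# Hypothetical sample input (assumption: stands in for sample.txt).
sample=$(mktemp)
printf '%s\n' \
  'Contact: alice@example.com or bob.smith+news@mail.example.org.' \
  'Not an address: user@localhost, @example.com' > "$sample"

# -o prints only the matched text, one match per line.
emails=$(grep -oE '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' "$sample")
printf '%s\n' "$emails"

rm -f "$sample"
```

Only the two addresses on the first line are printed. user@localhost is skipped because the domain has no dot followed by letters, and @example.com is skipped because the username part is empty.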
Command: grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' sample.txt
This command extracts strings shaped like IPv4 addresses (four groups of digits separated by dots) from sample.txt.
Detail:
\b : Matches a word boundary, so the match starts at the edge of a word.
([0-9]{1,3}\.){3} : Matches the first three groups of digits of the IP address, each followed by a dot. In more detail:
([0-9]{1,3}\.) : Matches one group of 1 to 3 digits (for example, 1, 12, 123) followed by a literal dot.
{3} : Repeats this group exactly three times, which covers the first three of the four groups of digits.
[0-9]{1,3} : Matches the fourth and last group of digits, which is 1 to 3 digits long and has no trailing dot.
\b : Matches a word boundary again, so the match ends at the edge of a word.
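Because [0-9]{1,3} accepts any number from 0 to 999, the pattern also matches strings such as 999.300.1.1 that are not real IPv4 addresses. One possible way to drop them, sketched here with awk as an added filter (the awk step and the sample lines are assumptions, not part of the original command), is to check that every group is at most 255:

```shell
# Hypothetical sample input (assumption: stands in for sample.txt).
sample=$(mktemp)
printf '%s\n' \
  'ok 192.168.1.10 connected' \
  'bad 999.300.1.1 rejected' \
  'ok 10.0.0.255 connected' > "$sample"

# grep extracts the candidates; awk splits each one on dots and
# keeps it only if all four groups are in the range 0-255.
valid_ips=$(grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' "$sample" |
  awk -F. '$1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255')
printf '%s\n' "$valid_ips"

rm -f "$sample"
```

Here grep alone would print three candidates, and the awk filter removes 999.300.1.1.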
# Using grep and regex to extract IP addresses
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log
# Using grep and regex to extract accessed URLs
grep -oE '"GET [^"]+"' example.log | cut -d' ' -f2
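# Example of parsing into 2 files: save the extracted IPs to ips.txt, then append them to ipsort.txt in reverse version order
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" example.log > ips.txt && sort -Vr ips.txt >> ipsort.txt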
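The log commands above can be exercised end to end on a throwaway log. This is a sketch under stated assumptions: the three log lines are invented, the files live in a temporary directory, and sort -V (version sort) is assumed to be available, as it is in GNU coreutils.

```shell
# Hypothetical log in a scratch directory (assumption: simplified access-log lines).
workdir=$(mktemp -d)
printf '%s\n' \
  '10.0.0.2 - - "GET /index.html HTTP/1.1" 200' \
  '9.0.0.1 - - "GET /about HTTP/1.1" 200' \
  '192.168.1.10 - - "POST /login HTTP/1.1" 302' > "$workdir/example.log"

# Accessed URLs: match the quoted GET request, then keep its second field.
urls=$(grep -oE '"GET [^"]+"' "$workdir/example.log" | cut -d' ' -f2)

# Two files: raw IPs in ips.txt, reverse version-sorted IPs in ipsort.txt.
grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' "$workdir/example.log" > "$workdir/ips.txt" &&
  sort -Vr "$workdir/ips.txt" > "$workdir/ipsort.txt"
sorted_ips=$(cat "$workdir/ipsort.txt")

printf '%s\n' "$urls" "$sorted_ips"
rm -rf "$workdir"
```

The POST request is left out of the URL list because the pattern only matches GET. Version sort compares the groups numerically, so 9.0.0.1 sorts below 10.0.0.2, whereas a plain text sort would put it above.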