regex accepts .couk - causes problems


songboy
04-27-2011, 01:39 PM
I'm not questioning this regular expression for an email:
if(!preg_match("/^([a-z0-9._%+-]+)@([a-z0-9.-]+)(\.[a-z]{2,4})$/",$email_address)) {
    $error_email = "Please check the email address.";
    include("c_writer_create_log.php");
    extract($_POST);
    exit();
}
because it does what it says.
My problem is that I've discovered (fortunately) that it accepts something like this:
etc etc .couk (notice there's no dot between co and uk).

If someone types in this error, my warning doesn't kick in and the address finds its way into the database. A subsequent mail send also fails. I haven't found one source that deals with this couk issue - probably because it's not an issue. The real issue is: how can I block a mistake like this from going into my database?
Help would be much appreciated -
Songboy
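
For reference, a small sketch of why the pattern lets this through: the last group, (\.[a-z]{2,4}), only checks for a dot followed by two to four lowercase letters, so ".couk" looks the same to it as any other four-letter ending such as ".info". The address user@example.couk is a made-up example, not one taken from the post.

```php
<?php
// Pattern copied from the post above.
$pattern = "/^([a-z0-9._%+-]+)@([a-z0-9.-]+)(\.[a-z]{2,4})$/";

// Hypothetical address with the missing dot (co.uk typed as couk).
$email_address = "user@example.couk";

// $parts[1] = local part, $parts[2] = domain, $parts[3] = ending.
$matched = preg_match($pattern, $email_address, $parts);

if ($matched === 1) {
    echo "accepted, ending seen by the regex: " . $parts[3] . "\n";
} else {
    echo "rejected\n";
}
```

So the regex is doing its job on syntax alone; it has no way of knowing which endings actually exist.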

songboy
04-27-2011, 10:35 PM
Having trawled around all day, I found a way to go at the problem.
Firstly, do the code as above. Now if someone puts in couk it will pass.
To stop it, I used another preg_match test:

if(preg_match("/couk/",$email_address))
{
    $error_email_new = "Please check the email ending.";
    include("c_writer_create_log.php");
    extract($_POST);
    exit();
}
The test looks for couk in the email address string. If it's there, the warning gets displayed and, more importantly, the address won't go through to the database.
Having thought about it, I could probably avoid more top level errors by extending the preg_match with 'or'.
All the best -
Songboy
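
One caveat with the test above: /couk/ matches anywhere in the string, so an address with "couk" in the name part would also be refused. A possible variant, sketched here under assumptions, anchors the check to the end of the address and uses the 'or' idea (regex alternation with |). The function name and the extra endings orguk and acuk are hypothetical examples, not from the post.

```php
<?php
// Hypothetical helper: true when the address ends in a known
// "missing dot" typo. The list of endings is an assumption;
// only couk comes from the thread.
function has_known_typo_ending($email_address)
{
    // \. requires the dot before the ending, $ anchors to the end,
    // | separates the alternatives, i ignores case.
    return preg_match('/\.(couk|orguk|acuk)$/i', $email_address) === 1;
}

$bad_ending  = has_known_typo_ending("user@example.couk");   // typo at the end
$name_only   = has_known_typo_ending("jcouk@example.com");   // "couk" only in the name
$good_ending = has_known_typo_ending("user@example.co.uk");  // correct ending

var_dump($bad_ending, $name_only, $good_ending);
```

Used in place of the second preg_match, only the first of these three addresses would trigger the warning.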

edbr
04-28-2011, 01:19 AM
Good result. It got me searching, and I found an interesting page:

http://www.regular-expressions.info/email.html

songboy
04-28-2011, 08:35 PM
Yep, this is where I got the one above. If you try it, it accepts couk. There's no reference to it in the article. I would have thought that this problem would have got more coverage than it did.
All the best -
Songboy

edbr
04-29-2011, 01:17 AM
Yes, so would I; frankly, I hadn't thought about it. The 2,4 limit might be the key, as most domains other than .com, .net etc. are 3 characters, so I was thinking of trying this, but I haven't a clue if it would work:

"/^([a-z0-9._%+-]+)@([a-z0-9.-]+)(\.[a-z]{2,3+})(\.[a-z]{0,3+})$/"
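
A note on the pattern above: as far as I can tell, {2,3+} is not valid repetition syntax in PCRE, so it is likely to be read as literal characters rather than a count. The sketch below is a guess at the intent (a 2-3 letter ending, optionally followed by a second one, as in .co.uk); the rewritten pattern, the helper name and the test addresses are all assumptions.

```php
<?php
// Assumed intent: ".xx" or ".xxx", optionally followed by a second
// ".xx" or ".xxx" (the trailing ? makes the last group optional).
$pattern = "/^([a-z0-9._%+-]+)@([a-z0-9.-]+)(\.[a-z]{2,3})(\.[a-z]{2,3})?$/";

// Hypothetical helper: true when the address matches the pattern.
function passes_pattern($pattern, $email_address)
{
    return preg_match($pattern, $email_address) === 1;
}

// Made-up addresses covering the cases discussed in the thread.
$addresses = array(
    "user@example.com",
    "user@example.co.uk",
    "user@example.couk",
    "user@example.info",
);

$results = array();
foreach ($addresses as $address) {
    $results[$address] = passes_pattern($pattern, $address);
    echo $address . ": " . ($results[$address] ? "accepted" : "rejected") . "\n";
}
```

This version does refuse couk, but only because it refuses every four-letter ending, so a real ending like .info is turned away too.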

Otherwise, a preg_replace before entering it into the db, perhaps.
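
A minimal sketch of the preg_replace idea, assuming the only repair wanted is turning a trailing .couk into .co.uk. The function name is hypothetical, and whether to correct silently or show the result back to the user as a suggestion is a design choice the thread doesn't settle.

```php
<?php
// Hypothetical helper: rewrites a trailing ".couk" to ".co.uk"
// and leaves every other address unchanged.
function fix_couk_ending($email_address)
{
    // $ anchors the match to the end so "couk" elsewhere is untouched.
    return preg_replace('/\.couk$/i', '.co.uk', $email_address);
}

$fixed     = fix_couk_ending("user@example.couk");
$unchanged = fix_couk_ending("user@example.com");

echo $fixed . "\n";
echo $unchanged . "\n";
```

Showing the corrected address back to the user is probably safer than storing it unseen, since the guess could be wrong.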