As a database programmer, addresses will crop up from time to time with the inevitability of death and taxes. Depending on the state of the database your working on at the time, it can feel like either or, in extreme cases, both.
This post is about the joy of UK postcodes and how you can validate that they are correctly formatted with…wait for it….regular expressions.
Before you run off screaming into the woods, I’d hasten to add that this can be achieved without overdue exertion of the top row of your keyboard.
First off, let’s look at what constitutes a valid format for a UK postcode. At this point, we need to refer to the ominously named BS7666.
It states that the following are valid formats (where A is a letter and N is a number) :
- AN NAA
- ANN NAA
- AAN NAA
- AANN NAA
- ANA NAA
- AANA NAA
…and also, just in case that sounds too much like your singing along to a song and have forgotten the words ( or possibly, are calling the name “Anna” whilst falling off a cliff), GIR 0AA.
What rules can we divine from this wisdom ?
Well, it seems that any valid format must :
- start with a letter
- must have at least one and up to 3 further characters before a space
- The space must be followed by a number
- There must then be at least a further two, but no more than three, alphanumeric characters
- must be a maximum of 8 charaters ( including the space)
Time to visit regular expression land :
[:alpha:] is any letter
[:digit:] is a number
[:alnum:] is any letter or number
{n} means match the preceeding charater set n times
{x,y} match the preceeding charter set at least x times but no more than y times
^ – anchor the expression to the start of the line
NOTE I’m indebted to Martin ( see comments), who pointed out that there were a couple of issues with my original solution.
The correct function definition is below.
So, putting this all together :
create or replace function validate_postcode_format_fn( pa_pcode in varchar2) return boolean as begin if pa_pcode = 'GIR 0AA' then return true; end if; if length( pa_pcode) > 8 then return false; end if; if regexp_like( pa_pcode, '^[[:alpha:]]{1,2}[[:digit:]]{1,2}[[:alpha:]]{0,1}[[:space:]]{1}[[:digit:]]{1}[[:alpha:]]{2}') then return true; end if; return false; end; /
And now for the test script …
set serveroutput on size 1000000 accept pcode prompt 'Enter Postcode : ' DECLARE l_pcode VARCHAR2(8) := UPPER('&pcode'); BEGIN IF validate_postcode_format_fn( l_pcode) THEN DBMS_OUTPUT.PUT_LINE(l_pcode||' is a valid postcode format'); ELSE DBMS_OUTPUT.PUT_LINE( l_pcode||' is not a valid postcode format'); END IF; END; /
Of course, it this doesn’t guarantee that what’s entered is a valid postcode ( try SIL LY1), but at least it does filter out the obviously incorrect stuff.
A shame you haven’t got a single thank you yet.
Here’s one – thank you! Great job, saved me the hassle of firing up Expresso for the afternoon!
LikeLike
Pete,
thanks. Glad you found it useful.
Mike
LikeLike
Angus MacLeod works for NHBC and he found your script whilst I was off for a week (pure 100% coincidence). Then he sent it to me for a QA.
I failed it!!!
I tested on MK5 8PF (pass), MK5 8P3 (should fail), WC1A 4AA (OK), WC1A AAA (should fail), L1 3AA (OK), LL 3AA (should fail), LL17 3AA (OK), and LL17 33A (should fail).
It passed them all.
Try ^[[:alpha:]]{1,2}[[:digit:]]{1,2}[[:alpha:]]{0,1} {1}[[:digit:]]{1}[[:alpha:]]{2} instead.
I will be writing to Mr Smithers about this!!
LikeLike
Martin,
That Mr Smithers, what a scamp he is.
5 years and 1500-odd views since this was published, you’re right.
There is one minor point with your proposed solution in that GIR 0AA is reported as an error.
However, as it’s the only UK postcode with this format, we can simply check for it to start with
rather than checking it with the regexp.
The post has been suitably amended.
Thanks to you and Angus for doing the QA on this 🙂
Mike
LikeLike
thanks for doing all the hard work for me
LikeLike
Thank you Mike for this post! This saved my time.
LikeLike