Oracle – Pinning table data in the Buffer Cache

As I write, Euro 2016 is in full swing.
England have managed to get out of the Group Stage this time, finishing second to the mighty…er…Wales.
Fortunately Deb hasn’t mentioned this…much.

In order to escape the Welsh nationalism that is currently rampant in our house, let’s try something completely unrelated – a tale of Gothic Horror set in an Oracle Database…

It was a dark stormy night. Well, it was dark and there was a persistent drizzle. It was Britain in summertime.
Sitting at his desk, listening to The Sisters of Mercy ( required to compensate for the lack of a thunderstorm and to maintain the Gothic quotient) Frank N Stein was struck by a sudden inspiration.
“I know”, he thought, “I’ll write some code to cache my Reference Data Tables in a PL/SQL array. I’ll declare the array as a package header variable so that the data is available for the entire session. That should cut down on the amount of Physical IO my application needs to do !”

Quite a lot of code later, Frank’s creation lurched off toward Production.
The outcome wasn’t quite what Frank had anticipated. The code that he had created was quite complex and hard to maintain. It was also not particularly quick.
In short, Frank’s caching framework was a bit of a monster.

In case you’re wondering, no, this is not in any way autobiographical. I am not Frank (although I will confess to owning a Sisters of Mercy album).
I am, in fact, one of the unfortunates who had to support this application several years later.

OK, it’s almost certain that none of the developers who spawned this particular delight were named after a fictional mad doctor…although maybe they should have been.

In order to prevent others from suffering from a similar misapplication of creative genius, what I’m going to look at here is :

  • How Oracle caches table data in Memory
  • How to work out what tables are in the cache
  • Ways in which you can “pin” tables in the cache (if you really need to)

Fortunately, Oracle memory management is fairly robust so there will be no mention of leeks

Continue reading

What’s Special About Oracle ? Relational Databases and the Thick Database Paradigm

A wise man (or woman – the quote is unattributed) once said that assumption is the mother of all cock-ups.
This is especially true in the wonderful world of databases.
The term NoSQL covers databases as different from each other as they are from the traditional Relational Database Management Systems (RDBMS).
The assumption implicit in that last sentence is that Relational Databases are broadly the same.

The problems with this assumption begin to manifest themselves when a team is assembled to write a new application running on an Oracle RDBMS.

Non-Oracle developers may have been used to treating databases as merely a persistence layer. Their previous applications may well have been written to be Database Agnostic.
This is a term which is likely to cause consternation among Oracle Developers, or at least, Oracle Developers who have ever tried to implement and support a Database Agnostic application running on Oracle. They may well think of this approach as the “Big Skip” anti-pattern where the database is treated as a dumping ground for any old rubbish the application feels like storing.

As a consequence, they will strongly favour the application being “Front-End Agnostic”. In other words, they will lean toward the Thick Database Paradigm as a template for application architecture.
With all of this Agnosticism about it’s amazing how religious things can get as the relative merits of these opposing views are debated.

These diametrically opposing views on the optimum architecture for a database centric application all stem from that one assumption about Relational Databases.
To make things even more interesting, both sides in this debate share this assumption.
The fact of the matter is that Oracle is very different from other RDBMSs. Oracle Developers need to appreciate this so that they can accept that the Database Agnostic Architecture is a legitimate choice for some RDBMSs and is not simply the result of non-Oracle Developers not knowing anything about databases.
The other point to note is that Oracle is very different from other RDBMS – OK, it’s technically the same point, but it’s such an important one, it’s worth mentioning twice.
Non-Oracle Developers need to understand this so that they can accept that the Thick Database Paradigm is a legitimate choice for the Oracle RDBMS and not simply the result of technological parochialism on the part of Oracle Developers.

Whatever kind of developer you are, you’re probably wondering just what I’m banging on about right now and where this is going.
Well, the purpose of this post is to take several steps back from the normal starting point for the debate over the optimal application architecture for a Database Centric Application on Oracle and set out :

  • Why Relational Databases are different from each other
  • Why the Thick Database Approach can be particularly suited to Oracle
  • Under what circumstances this may not be the case

Hopefully, by the end I’ll have demonstrated to any non-Oracle Developers reading this that the Thick Database Paradigm is at least worth considering when developing this type of application when Oracle is the RDBMS.
I will also have reminded any Oracle Developers that Oracle is a bit different to other RDBMS and that this needs to be pointed out to their non-Oracle colleagues when the subject of application architecture is being discussed.
I will attempt to keep the discussion at a reasonably high-level, but there is the odd coding example.
Where I’ve included code, I’ve used the standard Oracle demo tables from the HR application.
There are several good articles that do dive into the technical nitty-gritty of the Thick Database Paradigm on Oracle and I have included links to some of them at the end of this post.

I can already hear some sniggering when the term Thick Database gets used. Yes, you there in the “Web Developers Do It Online” t-shirt.
In some ways it would be better to think of this as the Intelligent Database Paradigm, if only to cater to those with a more basic sense of humour.

Continue reading

Null is Odd…or Things I Used to Know about SQL Aggregate Functions

Brendan McCullum recently played his final Test for New Zealand.
That’s something of an understatement. In his last game he made a century in a mere 54 balls, a feat unmatched in 139 years of test cricket.
From the outside looking in, it seemed that McCullum had come to realise something he’d always known. Playing cricket is supposed to be fun.
What’s more, you can consider yourself quite fortunate if you get paid for doing something you enjoy, especially when that something is hitting a ball with a stick.

With the help of Mr McCullum, what follows will serve to remind me of something I’ve always known but may forget from time to time.
In my case, it’s the fact that NULL is odd. This is especially true when it comes to basic SQL aggregation functions.
Continue reading

Standard SQL ? – Oracle REGEXP_LIKE

Is there any such thing as ANSI Standard SQL ?
Lots of databases claim to conform to this standard. Recent experience tends to make me wonder whether it’s more a just basis for negotiation.
This view is partly the result of having to juggle SQL between three different SQL parsers in the Cloudera Hadoop infrastructure, each with their own “quirks”.
It’s worth remembering however, that SQL differs across established Relational Databases as well, as a recent question from Simon (Teradata virtuoso and Luton Town Season Ticket Holder) demonstrates :

Is there an Oracle equivalent of the Teradata LIKE ANY operator when you want to match against a list of patterns, for example :

like any ('%a%', '%b%')

In other words, can you do a string comparison, including wildcards, within a single predicate in Oracle SQL ?

The short answer is yes, but the syntax is a bit different…. Continue reading

Getting the current SQL statement from SYS_CONTEXT using Fine Grained Auditing

The stand-off between Apple and the FBI has moved on. In essence both sides have taken it in turns to refuse to tell each other how to hack an iPhone.

Something else that tends to tell little or nothing in the face of repeated interrogation is SYS_CONTEXT(‘userenv’, ‘current_sql’).
If you’re fortunate enough to be running on Enterprise Edition however, a Fine Grained Auditing Policy will loosen it’s tongue.

Consider the following scenario.
You’ve recently got a job as a database specialist with Spectre.
They’ve been expanding their IT department recently as the result of their “Global Surveillance Initiative”.

There’s not much of a view from your desk as there are no windows in the hollowed out volcano that serves as the Company’s HQ.
The company is using Oracle 12c Enterprise Edition.

Everything seems to be going along nicely until you suddenly get a “request” from the Head of Audit, a Mr Goldfinger.
The requirement is that any changes to employee data in the HR system are recorded, together with the statement executed to change each record.
Reading between the lines, you suspect that Mr White – head of HR – is not entirely trusted by the hierarchy.

Whilst journalling triggers are common enough, capturing the actual SQL used to make DML changes is a bit more of a challenge.
Explaining this to Mr Goldfinger is unlikely to be a career-enhancing move. You’re going to have to be a bit creative if you want to avoid the dreaded “Exit Interview” (followed by a visit to the Piranha tank).

First of all though…. Continue reading

Adrift in a Data Lake – an Oracle Developer’s first steps in Hadoop

We live in interesting times. As I write, the political life of a great nation is currently in thrall to a wealthy right-wing populist with a rather distinctive hairstyle.
But enough about Boris Johnson.

For someone used to the way things are done in Oracle, Hadoop can be something of a culture shock.
My initial introduction to Hadoop and it’s “vibrant ecosystem” was some internally written documentation.
As with most technical documentation, it was written with the assumption of a certain level of prior knowledge on the part of the reader. For this particular reader, it proved to be an erroneous assumption.

After a half an hour of struggling through this text , I was left wondering what you would use a Khafka Banana Oozie for.

Maybe it’s for killing zombie processes ? Let’s face it, with all that Java running on your system there are bound to be a few knocking around.
I’m a little hazy on my Zombie Lore, so I’m not entirely clear as to why a rapid application of fresh fruit would pose an existential threat to your average zombie. Maybe they’re potassium intolerant ?

There are a bewildering array of tools associated with Hadoop, many of which appear to provide almost identical functionality.
For example, a relational database traditionally requires only one SQL Engine…and I’ll be taking a look at two of them.

Moving from Oracle to Hadoop can feel rather like trading your car in for a box of Lego.
If the box does contain any instructions they seem to have been translated from Java into English…by someone who doesn’t speak either.
Note to reader : please substitute your own language as appropriate.

Fortunately, there are Hadoop distributions available which bundle the core tools required to get up an running. We’ll gloss over the fact that these distributions – Cloudera and Hortonworks – don’t contain the exact same tools.

In my search to find a coherent explanation of how to use Hadoop, I’ve found that the VM provided by Cloudera, together with the introductory tutorial, is a valuable aid to familiarising myself with the basics.

To start with then, I’ll cover getting hold of the Cloudera VM and setting it up in VirtualBox.
Then I’ll go through some of the tools available and what they do.
I’ll do this from the perspective of an Oracle developer (mainly because I don’t have much option) and will point out the driftwood of familiarity that you might be able to cling to in order to stay afloat in your Data Lake.

What I’ll cover is :

  • The core components of Hadoop
  • HDFS commands
  • Transferring data between a relational database and hadoop using SQOOP
  • Querying structured data using Hive and Impala
  • Uploading data using Kite

Ready to Dive in ?

Continue reading

Live to Win – Motorhead Covers and Pythonic Irrigation

The recent passing of Lemmy has caused me to reflect on on the career of one of the bands who made my growing up (and grown-up) years that much…well…louder.

Yes, I know that serious Python documentation should employ a sprinkling of Monty Python references but, let’s face it, what follows is more of a quick trawl through some basic Python constructs that I’ve found quite useful recently.
If I put them all here, at least I’ll know where to look when I need them again.

In any case, Michael Pailin made a guest appearance on the album Rock ‘n’ Roll so that’s probably enough of a link to safisfy the Monty Python criteria.

I find Python a really good language to code in…especially when the alternative is writing a Windows Batch Script. However, there is a “but”.
Python 3 is not backward compatible with Python 2. This can make life rather interesting on occasion.

It is possible to write code that is compatible with both versions of the language and there’s a useful article here on that topic.

The code I’ve written here has been tested on both Python 2 (2.7.6) and Python 3 (3.4.3).

One of the great things about Python is that there are a number of modules supplied as standard, which greatly simplify some common programming tasks.
What I’m going to run through here is :

  • Getting information about the environment
  • Handling runtime arguments with the argparse module
  • Reading config files with configparser
  • Writing information to log files with the logging module

Continue reading