I have been having, "heated discussions" with a few (male, read on) OCP's from a subsidiary company. Arguing over some Oracle B*tree index myths and other things they "know" about oracle. These fellows are 10g OCP, but all straight out of university and very gung ho. ADF over oracle, .NET over oracle (shudder) and no need for any business logic in the database. Very smart and knowledgeable fellows who are an asset to the company. I just keep slamming them on their database interaction work as I do the performance monitoring on their databases and they are a little sick and tired of ticket after ticket after ticket coming their way of "Bad SQL" or "Inefficient SQL" and the one that started this "Unused Indexes" and thought they would speak up. All very friendly of course.
We came to a agreement that I would put together a nice document blowing away their index theories. I will work on the rest of their theories later. When I have successfully debunked their myths, they will wear skirts,bobby socks, a blouse and makeup to work for a period of no less than 4 straight hours covering at minimum of two of the following - Morning arrival,Morning coffee break, lunch, afternoon coffee break or 5pm departure. If I fail, I will have to wear the same outfit. We have manager approval and backing for this and odds are it will end up in the company newsletter. Not entirely fair as they have 9-12 people at their location. I have 200+ people in this office, but I think I have the legs to pull it off :).
I personally believe these to be myths, well one because Thomas Kyte says so, and two I did some testing and checked it out in the past.
The following are the 3 index myths we decided on, well statements I told them were wrong and they argued the point. They are going to put together tests to prove me wrong. In a event of a tie or unbreakable test cases, a simple toss of a coin will decide our fates. Best out of 3 wins.
Myth #1
You must use a index, full tablescans are always less efficient and slower.
Myth #2
If you index each individual column the optimizer will use multiple indexes in a query if the predicate has any combination of the columns, so build a index on every column.
Myth #3
You must build a calculated column to properly index a date field. So to store a date, the best way to do it is with a VARCHAR field to improve DML to the table.
Over the next few days, I will be putting together the tests and posting the results. Since we never spoke about the ability to look for outside corroboration or help, I thought I would do the tests, and then run them by my blog to see if any readers can find fault in them. Due date for the tests is the end of the month. Management didn't want us using too much company time for this childishness. I know I have 2 and 3 in the bag.. #1 is going to be hard to prove, I think in one of Tom's books he does something similar. I will be reading those chapters over again tonight.
Parameters are Oracle 10gR2 on windows (shudder) with the following somewhat abbreviated init.ora (some parameters removed):
orcl.__db_cache_size=192937984
orcl.__java_pool_size=12582912
orcl.__large_pool_size=4194304
orcl.__shared_pool_size=75497472
orcl.__streams_pool_size=0
*.compatible='10.2.0.1.0'
*.db_block_size=8192
*.optimizer_mode=ALL_ROWS*.db_file_multiblock_read_count=16
*.db_recovery_file_dest_size=2147483648
*.job_queue_processes=10
*.open_cursors=300
*.pga_aggregate_target=96468992
*.processes=150
*.sga_target=289406976
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
I am probably going to regret this is some way. But WTF you only live once. I think my wife is still laughing, she offered to shave my legs for me. Evil, Evil lady :)
Stay tuned, I am going to spend some time on building these tests.
4 comments:
I was only concerned with #1 because I couldn't remember where I had seen the debunking of it.
But I did find it in Tom's book last night so I am confident.
Myth #1 came out of a discussion about rules of thumb that if you are accessing a certain percentage of rows you must use a index.
The myth got percolated down to what I wrote.
I am pretty sure I can have all 3 of these written up and off to the soon to be public cross dressers before the weekend.
My manager left me a note that he has ordered 2 copies of Tom's newest book for the fellows.
I will have to make sure I don't plagiarize him exactly.
I think your biggest 'threat' to myth 1 is full index scans. If all the required columns are in an index (and at least one is not null) then you'd probably get better performance from the index scan than the tablescan.
Of course that would mean that if you want all the columns, they would all need to be in the index in which case you maybe want a index-organised table.
Foote's demo of the threshold myth. (And some other useful info.)
And there is a difference between plagiarism and scholarly discourse - attribution! :-)
Are you sure it's punishment for them to dress up in women's clothing? :-)
Yes, I found out through the "grapevine" that they are rather concerned as they went ahead to prove me wrong and have ended up proving themselves wrong.
I will have my final stuff up on the blog before Monday.
Post a Comment