Wednesday, May 31, 2006

Availablity

Not really oracle related, other than it runs on an oracle database and I don't like to pick on a company in public, but.... We have a product that uses ESRI's SDE software for mapping. We get most of our support through a vendor who has a very good working knowledge of ESRI. We have been getting some pressure to make the software available 24x7 due to changes in the staffing model. For maintance, SDE requires no users accessing the software, that doesn't make 24x7 very easy to do. We spoke with the vendor with no success. So I thought I would contact ESRI direct. I go and look up the page, find the ESRI developer site. I really and truly started to laugh out loud as I read the home page at http://edn.esri.com/index.cfm?fa=home.welcome . To make it easy to find, here is a copy/paste: Announcements Developer Summit—More Developer Summit session information has been posted. The information can be found under the Media Center tab. We apologize for the delay in posting these items. Limited Availability on Sunday—Each Sunday EDN's availability between approximately 8:00 a.m. and 1:30 p.m. (Los Angeles Time) may be interrupted intermittently due to routine maintenance. We apologize for any inconvenience this may cause you. I have a feeling trying to convince them that downtime for maintenance is not acceptable is going to be a very, very hard sell.

Tuesday, May 30, 2006

Another for the WTF

Question from a developer: "Why does this query take so long to process?"
    SELECT 1 FROM HEADER WHERE 1 IN (SELECT 1 FROM DETAIL WHERE 1=1 AND DETAIL.HEADERID = '3432223' AND DETAIL.SITEID='OVA3' AND DETAIL.status in ('DKAPPR','DFFAPPR','UIUAPPR')) ;
Can anybody spot the problem? They are looking for a simple return, they do not look at the value of the return as long as one row is returned, the java returns TRUE and they continue on. They run the above SQL for every record in HEADER that they are looking to update to look to see if there are corresponding detail lines. The HEADER table has 200K + records. I suggested this one to them.
    SELECT 1 FROM HEADER WHERE 1=1 AND HEADERID IN (SELECT HEADERID FROM DETAIL WHERE 1=1 AND DETAIL.HEADERID = '200504505' AND DETAIL.SITEID='OVA3' AND DETAIL.status in ('DKAPPR','DFFAPPR','UIUAPPR') ) ;
There are much better ways to do this, I will let the developer work on those, he is very embaressed and is spending the needed time on the issue and has since this one found other queries that were along the similiar lines.

Wednesday, May 24, 2006

Not to start a war here or anything

BUT, tomorrow is the beginning of the end for Windows Oracle Database servers in our company. The very LAST oracle database that runs on windows is in the pipe to be converted to HPUX, due date - June 4th, but this might get pushed back a couple of weeks due to factors I can't control. Testing is ongoing, but it looks good, users are impressed by the performance and the vendor is happy. 132 Databases in the company, 131 on a flavour of unix (HPUX,Linux many types, AIX and Solaris) and ONE windows. Free of windows at last... Now, if I could only convince management to allow me to run linux on my laptop. Now... to start the conversion process off of HPUX RISC onto something else... I am thinking AIX. :)

Sunday, May 21, 2006

Multiple Failures

Yesterday (Saturday) was a very bad day... today isn't much better, and tomorrow is looking pretty gloomy too. Yesterday, had a major power failure at our main data center. That was the first failure, UPS all kicked in and started working, but notification system to tell everybody that the UPS were working failed, nobody knew we were on UPS. Second failure. Generator kicks in as it is supposed to, but fuel line from the main tank disconnects from the generator, generator runs out of petrol and stops. Third Failure. UPS systems ran out after a few hours and everything went down as hard as hard can be. As it was the weekend, user load was low, but when the systems went "blank" and didnt' come back, the users noticed. They go to our online emergency contact page to get numbers - gone, servers have no power. They go to the fancy "in-phone" directory listing for our new VOIP phones, gone, server has no power. They can't send emails, email system is in the data center. New system, the redundant backup system is still sitting beside the production system for final testing. Somebody looks up the managers home phone via 411 and gives him a call, and the word goes out. Holiday long weekend, NOBODY around. Manager is calling relatives looking for people. Generator gets repaired and fired up. We are at about 4 hours of downtime now. Everything comes back and the UPS start to charge. We do nothing until the UPS charge fully... 2 hours more hours, start to bring everything back up. Our new SAN won't come back. Our new NAS won't come back. Both have been rigoursly tested and guarenteed by the manufacturer. Won't come back. A manager is driving around the city looking for the senior system administrator's car hoping he is out for dinner or shopping instead of gone for the weekend. His cell phone died last week and it hasn't been fixed yet. Believe it or not, he finds him and his family. Only takes an hour more and everything including the SAN and NAS is back and functioning. Time for the DBA's to take over. 23 databases come back no problem, automatic crash recovery, no issues. Three don't. We get one back up and running with a rman restore via control files (MAN I love RMAN). The second one is actually our primary rman catalog database, it is gone, history. It of course doesn't back up to a catalog server, entire blade is corrupted. No hope of fixing. Rebuild it and restore from an export on tape, back and running in 3 hours. The third one is easily restored via rman. Find out during the restore of the export, that the brand new very expensive tape array is not functioning. That is being rebuilt as I type. New lesson learned. When you are configuring and testing your redundant systems. Don't build them at your local site and move them when done, instead ship them direct to where they are going and build them there. Turns out the mail notifications on the UPS being kicked on didn't get sent because the UPS on the network gear the data center's SMTP server is on failed and the server went down hard. UPS has a "certified working" sticker from 22 days ago. No data loss to any of the databases thank goodness for that. The meetings next week on this are going to be very bad indeed.

Tuesday, May 09, 2006

Use what you learn

Way back, way way back I remembered reading about the "WITH query_name (select ...) subq select * from subq", which oracle calls Subquery Factoring I remember thinking at the time "that's neat" and then of course promptly forgot about it. Well, today I spent about 3 hours trying to optmize an audit report on changes to employees in our payroll system. A huge nasty bit of SQL that made numerous FTS into the audit table and the employee table to gather the audit information the way the users demanded it. I tried materialized views, rewriting the queries, even considered pipelined functions, then for some weird reason the old noodle remembered the subqueries - 1 minute to find it in the SQL reference guide, another 5 to rebuilt the SQL. Cost on the query cut from over 30,000 to 543. The time of the query, cut from over 8 minutes to under 5 seconds. This is a direct copy/paste straight out of the 10G SQL reference manual:


 Subquery Factoring: 
Example The following statement creates the query names dept_costs 
and avg_cost for the initial query block containing a join, and then 
uses the query names in the body of the main query. 

WITH dept_costs AS ( SELECT department_name, SUM(salary) dept_total 
 FROM employees e, departments d WHERE e.department_id = d.department_id GROUP BY department_name)
 , avg_cost AS ( SELECT SUM(dept_total)/COUNT(*) avg FROM dept_costs) 

SELECT * FROM dept_costs 
 WHERE dept_total > (SELECT avg FROM avg_cost) 
 ORDER BY department_name; 

DEPARTMENT_NAME                DEPT_TOTAL 
------------------------------ ---------- 
Sales                          313800 
Shipping                       156400 

Nothing new or earth shattering here... just a comment on how the brain works.