Ten more database mysteries l.jpg
This presentation is the property of its rightful owner.
Sponsored Links
1 / 27

Ten More Database Mysteries PowerPoint PPT Presentation


  • 109 Views
  • Uploaded on
  • Presentation posted in: General

Ten More Database Mysteries. Chris Lawson Database Specialists, Inc. www.dbspecialists.com [email protected] 1. The Case of the Unwanted Services. DBA detects listener starting extra services NOT defined! > lsnrctl status Services Summary …

Download Presentation

Ten More Database Mysteries

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Ten more database mysteries l.jpg

Ten More Database Mysteries

Chris Lawson

Database Specialists, Inc.

www.dbspecialists.com

[email protected]


The case of the unwanted services l.jpg

1

The Case of the Unwanted Services

  • DBA detects listener starting extra services NOT defined!

    > lsnrctl status

  • Services Summary…

  • database1 has 1 service handler(s)

  • database2 has 1 service handler(s)

  • databasez has 5 service handler(s)

  • They appear to be harmless, but what are these extra services?

  • Restarted Listener. At first all OK, then 5 extra services activated.

  • Confirmed that listener.ora file does NOT list the unwanted database.

  • DBA confirmed using correct listener.ora file.

  • Puzzle: There is no entry whatsoever for databasez.


Unwanted services solution l.jpg

Unwanted Services: Solution

  • Multiple listener services is indicative of MTS (Multi-threaded Server) Dispatchers; however, MTS was not being used!

  • DBA recalled that databasez was actually on a different server.

  • Solution: Database on different server started MTS services.

  • Key init.ora parameter:

    MTS_LISTENER_ADDRESS = "(ADDRESS=(PROTOCOL=tcp)(HOST=myhost)(PORT=1526))"

  • Services match init.ora parameter MTS_DISPATCHERS.

  • How it happened: The remote init.ora file had originated on ‘our’ server, and the MTS parameter entries were not removed.

  • Resolution: On other server, DBA corrected the init.ora entry, bounced database. Extra services automatically stopped.

  • But - that’s not all...


Unwanted services the rest of the story l.jpg

Unwanted Services: The Rest of the Story

  • These extra services turned out to be not so “harmless” after all.

  • The ‘unwanted’ services are fully functional; they will intercept connection requests and redirect them to a different server!

  • Production users were redirected to a “clone” testing database on another server, and used it for several hours!

  • This occurs because each MTS service points to a particular dispatcher on a particular server.

    > lsnrctl services

    Services Summary...

    demo has 3 service handler(s)

    D001 <machine: hohp2, pid: 3385> (ADDRESS=(PROTOCOL=tcp)(DEV=17)(HOST=[1.2.3.4)(PORT=1179))

  • Perhaps could be used for rapid failover to standby database?


Incognito dba privileges l.jpg

2

Incognito DBA Privileges

  • Application testers need to “refresh” the test database often.

  • They will need to start/shut database as the oracle user.

  • DBA, being suspicious (justly) by nature, is reluctant to provide oracle account.

  • As punishment, users frequently ask DBA to run the refresh.

  • Question: How can developers perform tasks as oracle, without actually having the oracle account?


Incognito dba privileges solution l.jpg

Incognito DBA Privileges: Solution

  • Create script to refresh database, including startup, shutdown, etc.

  • UNIX setuid feature allows programs to run with another identity. But setuid feature does not propagate to commands within the script.

  • Trick: Use a ‘wrapper’ C program that runs as oracle and calls the refresh script:

    # include <stdio.h>

    main ()

    {

    system (“./refresh.ksh”);

    }

  • Activate setuid for the wrapper program: chmod 6711 refresh

  • Users simply enter refresh to run the refresh program as oracle.

  • Thanks to Brian Keating of Database Specialists for this solution.


Nt remote control l.jpg

3

NT Remote Control

  • NT box is running an important database. It appears that the Listener is down.

  • NT server is remote.

  • Question: How can you check the listener remotely?


Nt remote control solution l.jpg

NT Remote Control: Solution

  • Use ‘SC’ command. It is part of the NT Resource Kit.

  • The SC utility is very similar to the NET commands.

  • First enable a security "context" to the remote box; e.g. map a network drive to the remote server, and provide the administrator account/password.

  • The network drive may be disconnected after SC use is done.

  • Note: A mapped network drive is not required if you are connected locally as administrator, and the remote server uses the same administrator password.

  • Run SC using the format: SC \\[SERVER] [CMD] SERVICE


Nt remote control solution continued l.jpg

NT Remote Control: Solution (continued)

Some common SC commands are:

query--------Queries the status for a service

start--------Starts a service

pause--------Sends a PAUSE control request

continue-----Sends a CONTINUE request

stop---------Sends a STOP request to a service

config-------Changes the configuration of a service

qc-----------Queries the configuration information

delete-------Deletes a service (from the registry)

create-------Creates a service (add to registry)


Nt remote control solution continued10 l.jpg

NT Remote Control: Solution (continued)

EXAMPLE:

Check status of the Oracle 8 LISTENER service.

NT server named ‘TSUNAMI’

SC \\TSUNAMI QUERY ORACLETNSLISTENER80

SERVICE_NAME: ORACLETNSLISTENER80

TYPE : 10 WIN32_OWN_PROCESS

STATE : 4 RUNNING (STOPPABLE,NOT_PAUSABLE,IGNORES_SHUTDOWN)

WIN32_EXIT_CODE : 0 (0x0)

SERVICE_EXIT_CODE : 0 (0x0)

CHECKPOINT : 0x0

WAIT_HINT : 0x0j


The sad case of the failing failover l.jpg

4

The Sad Case of the Failing Failover

  • Very critical Customer Support application for large HMO.

  • 24 x 7 crucial. Millions of monthly customers interactions.

  • IBM RS-6000 High-availability cluster (HACMP)

  • Scenario: Sudden hardware glitch causes failover to backup node.

  • Secondary node fails to start database! Database complains of missing datafile.

  • Critical application is down!

  • Objective: Early-rising DBA (from different project) not familiar with the setup needs to get database up!


The failing failover solution l.jpg

The Failing Failover: Solution

  • Investigation shows all file systems mounted. Bad News: file in question is simply not there.

  • Good News: File is not supposed to be there--only a symbolic link.

  • Each node has symbolic links pointing from “nice-looking” paths to the actual location of raw device:

    /u04/oradata/data01.dbf > /dev/r123

  • The primary node had all the links; the backup node was missing 2 symbolic links. Simply creating the links allowed database startup.


The failing failover solution continued l.jpg

The Failing Failover: Solution(continued)

  • Moral of the story: High reliability hardware only as good as the weakest link--including maintenance requirements.

  • This strategy was very susceptible to human error. Every new .dbf file required new symbolic link on both nodes. One omission completely destroyed the high-availability plan.


The case of the impatient insurance agent l.jpg

5

The Case of the Impatient Insurance Agent

  • Query to find details for expired insurance policies only returns 50 rows, but takes 2 minutes to finish.

  • Query is simple join of 2 tables: Response and Policy

    5 where-clause ‘filters’ 1 ‘filter’

    result set: 20,000 rows result set: 15,000 rows

RESP

POL

policy_id

result:

only 50 rows

  • Puzzle: How can query ever be quick, since the ‘super filter’ to reduce set to 50 rows is spread across two tables ?


Impatient agent solution l.jpg

Impatient Agent: Solution

The Trick:

1) Perform pre-processing that uses filters, but retrieves only the key (policy_id) that is used as the join column.

2) This field is retrieved via index read only--thus avoiding the table reads that account for the delays.

3) Use this result set as the starting point for the original join.

pre-processing driving “table”

original join

RESP

POL

RESP

POL

index reads only

table reads on small set only


Daffy database links l.jpg

6

Daffy Database Links

  • Version 7.3.4 database. Database links working normally. Database has been running straight for 3 weeks.

  • Suddenly, the links fail: ORA-12154: TNS:could not resolve service name

  • Investigation shows tnsnames.ora file unchanged. Link definition unchanged.

  • Listener is running Oracle 8.0.4. Adding entry to tnsnames.ora file in Oracle 8 directory causes the links to work again!

  • Tests with new links show that the tnsnames.ora file for 7.3.4 is being ignored!

  • Further, if Oracle 8 tnsnames file is removed, the one in Oracle 7 is used instead!


Database links solution l.jpg

Database Links: Solution

  • Clue: Listener had been restarted recently. But why should the Listener process (a server side function) affect finding the tns alias?

  • Metalink analyst suggested checking TNS_ADMIN; but how is that relevant? We are not creating a UNIX session.

  • Oracle Note 37808.1 clears-up the confusion:

    For database links, TNS_ADMIN takes its value from

    the value defined when the listener was started

  • That is, a client process (link) is influenced by a server-side process!

  • The listener had been started most recently with TNS_ADMIN set to 8.0.4, causing links to look in 8.0.4 tns file.

  • Unsetting TNS_ADMIN, then restarting listener caused all operations to run as expected.


All primary keys are equal but some are more equal than others l.jpg

7

All Primary Keys are Equal(But some are more equal than others)

  • Background: Application tuning often requires DBA to ‘lure’ optimizer into using certain indexes. Hints not always successful or possible.

  • Optimizer ‘likes’ certain indexes, because they typically are faster than others. The favorite: an index on Primary Key.

  • Scenario: 20 gigabyte insurance billing system. Oracle 8.0.4.

  • Particular query joins to a table called POLICY. Joinuses the PK index (policy_ID).

  • Query speed-up requires that join use new index on (PK, other col).

  • Created new double index--but optimizer would not cooperate. It foolishly insists on using the PK index, even if hint used.

  • Question: How can we get the optimizer to obey the DBA?


All primary keys are equal solution l.jpg

All Primary Keys are Equal: Solution

  • We need a way to ‘trick’ the optimizer into not using the PK index, but instead, use our ‘extra column’ index.

  • This is difficult, because the PK index is the #1 choice.

  • Trick: ‘Disguise’ the PK index as a unique index.

  • Oddity: If unique index already exists, addition of a matching PK will use the existing index.

  • In determining execution plan, optimizer will treat the new index as if it were a unique index, not a PK index.

  • So, drop the PK, build a matching unique index, then rebuild the PK.

  • Optimizer no longer stubbornly insists on using this ‘PK’ index, because it is not a true, pedigreed PK index--merely a unique index.


The sad case of the homeless archive logs l.jpg

8

The Sad Case of the Homeless Archive Logs

  • Scenario: Large medical application; hundreds of connections into database. This is a critical, 24x7 server.

  • Users suddenly complain of database ‘hanging.’

  • Alert log shows database unable to write archive log.

    ORA-00255: error archiving log 1 of thread 1, sequence # 200

    ORA-00270: error creating archive log /demo/arch/1_200.dbf

    ORA-19504: failed to create file "/demo/arch/1_200.dbf"

  • Investigation reveals disk crash on disk housing archive logs.

  • No .dbf files on that disk--only archive logs.

  • Problem: How can database operation be resumed, with minimal disruption to hundreds of users?


Homeless archive logs solution l.jpg

Homeless Archive Logs: Solution

  • DBA decided to dynamicallyredirect the archive logs, thereby avoiding need to shutdown database.

  • To change destination of archive logs:

    alter system archive log start to ‘[new path]’

  • For example:

    SVRMGRL> alter system archive log start to '/demo/arch2/'

  • Once the ‘backlog’ of archive logs is corrected, database automatically resumes normal operation.

  • Alert log now shows:

    Fri Feb 11 09:00:47 2000

    ARCH: Archival started

    Archiver process freed from errors. No longer stopped.


The puzzling affair of the old archive logs l.jpg

9

The Puzzling Affair of the Old Archive Logs

  • With database in Archive Mode, old archive logs must be purged.

  • On Unix, this is simple, using the find command (in cron).

  • For example, to purge archive logs greater than 6 days old:

    find /logdir/arch -name “arch*” -mtime +6 -exec rm {} \;

  • Problem: How can we similarly purge the old archive logs on NT?


Archive logs solution l.jpg

Archive Logs: Solution

  • On NT, there is apparently no ‘find’ command, as in Unix.

  • The archive logs can be found indirectly, because the database ‘knows’ when/where they were written.

  • Logs listed in V$ARCHIVED_LOG

  • Example, to remove logs older than 2 days:

    spool del_logs.sql

    select 'del ‘ || name from v$archived_log where completion_time < (sysdate - 2);

    spool off

    host del_logs

  • Similar query could be used to copy archive logs to backup.


Sql plus madness l.jpg

10

SQL*Plus Madness

  • Scenario: Connection ‘hangs’ upon trying to connect in SQL*Plus.

  • Also get error message re DBMS_APPLICATION_INFO

  • DBA investigates. He finds:

    1) Connect via Svrmgrl ok;

    2) System user connects ok

  • Problem: How can DBA ‘fix’ SQL*Plus?


Sql plus madness solution l.jpg

SQL*Plus Madness: Solution

  • Package DBMS_APPLICATION_INFO is used to ‘register’ a running application, for tracking/analysis purposes.

  • Once an application is ‘registered’ it will show up in v$session in ‘module’ field.

  • SQL*Plus is one of few applications that try to ‘register.’ This explains why svrmgrl still works.

  • This all points to problem with the ‘register’ of SQL*Plus.

  • The set-up script to build necessary tables is called ‘pupbld.sql’


Sql plus madness solution continued l.jpg

SQL*Plus Madness: Solution(continued)

  • Action: Try to run pupbld.sql as SYSTEM but it hangs!

  • Investigation shows: product_profile table missing, but synonym still there!

  • Solution: drop public synonym for product_profile, then rebuild pupbld. All OK now.


Contact information l.jpg

Contact Information

Chris Lawson

[email protected]

http://www.dbspecialists.com

Database Specialists, Inc.

388 Market Street, Suite 400

San Francisco, CA 94111


  • Login