ten more database mysteries
Download
Skip this Video
Download Presentation
Ten More Database Mysteries

Loading in 2 Seconds...

play fullscreen
1 / 27

Ten More Database Mysteries - PowerPoint PPT Presentation


  • 140 Views
  • Uploaded on

Ten More Database Mysteries. Chris Lawson Database Specialists, Inc. www.dbspecialists.com [email protected] 1. The Case of the Unwanted Services. DBA detects listener starting extra services NOT defined! > lsnrctl status Services Summary …

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Ten More Database Mysteries' - sawyer


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
ten more database mysteries
Ten More Database Mysteries

Chris Lawson

Database Specialists, Inc.

www.dbspecialists.com

[email protected]

the case of the unwanted services

1

The Case of the Unwanted Services
  • DBA detects listener starting extra services NOT defined!

> lsnrctl status

  • Services Summary…
  • database1 has 1 service handler(s)
  • database2 has 1 service handler(s)
  • databasez has 5 service handler(s)
  • They appear to be harmless, but what are these extra services?
  • Restarted Listener. At first all OK, then 5 extra services activated.
  • Confirmed that listener.ora file does NOT list the unwanted database.
  • DBA confirmed using correct listener.ora file.
  • Puzzle: There is no entry whatsoever for databasez.
unwanted services solution
Unwanted Services: Solution
  • Multiple listener services is indicative of MTS (Multi-threaded Server) Dispatchers; however, MTS was not being used!
  • DBA recalled that databasez was actually on a different server.
  • Solution: Database on different server started MTS services.
  • Key init.ora parameter:

MTS_LISTENER_ADDRESS = "(ADDRESS=(PROTOCOL=tcp)(HOST=myhost)(PORT=1526))"

  • Services match init.ora parameter MTS_DISPATCHERS.
  • How it happened: The remote init.ora file had originated on ‘our’ server, and the MTS parameter entries were not removed.
  • Resolution: On other server, DBA corrected the init.ora entry, bounced database. Extra services automatically stopped.
  • But - that’s not all...
unwanted services the rest of the story
Unwanted Services: The Rest of the Story
  • These extra services turned out to be not so “harmless” after all.
  • The ‘unwanted’ services are fully functional; they will intercept connection requests and redirect them to a different server!
  • Production users were redirected to a “clone” testing database on another server, and used it for several hours!
  • This occurs because each MTS service points to a particular dispatcher on a particular server.

> lsnrctl services

Services Summary...

demo has 3 service handler(s)

D001 <machine: hohp2, pid: 3385> (ADDRESS=(PROTOCOL=tcp)(DEV=17)(HOST=[1.2.3.4)(PORT=1179))

  • Perhaps could be used for rapid failover to standby database?
incognito dba privileges

2

Incognito DBA Privileges
  • Application testers need to “refresh” the test database often.
  • They will need to start/shut database as the oracle user.
  • DBA, being suspicious (justly) by nature, is reluctant to provide oracle account.
  • As punishment, users frequently ask DBA to run the refresh.
  • Question: How can developers perform tasks as oracle, without actually having the oracle account?
incognito dba privileges solution
Incognito DBA Privileges: Solution
  • Create script to refresh database, including startup, shutdown, etc.
  • UNIX setuid feature allows programs to run with another identity. But setuid feature does not propagate to commands within the script.
  • Trick: Use a ‘wrapper’ C program that runs as oracle and calls the refresh script:

# include <stdio.h>

main ()

{

system (“./refresh.ksh”);

}

  • Activate setuid for the wrapper program: chmod 6711 refresh
  • Users simply enter refresh to run the refresh program as oracle.
  • Thanks to Brian Keating of Database Specialists for this solution.
nt remote control

3

NT Remote Control
  • NT box is running an important database. It appears that the Listener is down.
  • NT server is remote.
  • Question: How can you check the listener remotely?
nt remote control solution
NT Remote Control: Solution
  • Use ‘SC’ command. It is part of the NT Resource Kit.
  • The SC utility is very similar to the NET commands.
  • First enable a security "context" to the remote box; e.g. map a network drive to the remote server, and provide the administrator account/password.
  • The network drive may be disconnected after SC use is done.
  • Note: A mapped network drive is not required if you are connected locally as administrator, and the remote server uses the same administrator password.
  • Run SC using the format: SC \\[SERVER] [CMD] SERVICE
nt remote control solution continued
NT Remote Control: Solution (continued)

Some common SC commands are:

query--------Queries the status for a service

start--------Starts a service

pause--------Sends a PAUSE control request

continue-----Sends a CONTINUE request

stop---------Sends a STOP request to a service

config-------Changes the configuration of a service

qc-----------Queries the configuration information

delete-------Deletes a service (from the registry)

create-------Creates a service (add to registry)

nt remote control solution continued10
NT Remote Control: Solution (continued)

EXAMPLE:

Check status of the Oracle 8 LISTENER service.

NT server named ‘TSUNAMI’

SC \\TSUNAMI QUERY ORACLETNSLISTENER80

SERVICE_NAME: ORACLETNSLISTENER80

TYPE : 10 WIN32_OWN_PROCESS

STATE : 4 RUNNING (STOPPABLE,NOT_PAUSABLE,IGNORES_SHUTDOWN)

WIN32_EXIT_CODE : 0 (0x0)

SERVICE_EXIT_CODE : 0 (0x0)

CHECKPOINT : 0x0

WAIT_HINT : 0x0j

the sad case of the failing failover

4

The Sad Case of the Failing Failover
  • Very critical Customer Support application for large HMO.
  • 24 x 7 crucial. Millions of monthly customers interactions.
  • IBM RS-6000 High-availability cluster (HACMP)
  • Scenario: Sudden hardware glitch causes failover to backup node.
  • Secondary node fails to start database! Database complains of missing datafile.
  • Critical application is down!
  • Objective: Early-rising DBA (from different project) not familiar with the setup needs to get database up!
the failing failover solution
The Failing Failover: Solution
  • Investigation shows all file systems mounted. Bad News: file in question is simply not there.
  • Good News: File is not supposed to be there--only a symbolic link.
  • Each node has symbolic links pointing from “nice-looking” paths to the actual location of raw device:

/u04/oradata/data01.dbf > /dev/r123

  • The primary node had all the links; the backup node was missing 2 symbolic links. Simply creating the links allowed database startup.
the failing failover solution continued
The Failing Failover: Solution(continued)
  • Moral of the story: High reliability hardware only as good as the weakest link--including maintenance requirements.
  • This strategy was very susceptible to human error. Every new .dbf file required new symbolic link on both nodes. One omission completely destroyed the high-availability plan.
the case of the impatient insurance agent

5

The Case of the Impatient Insurance Agent
  • Query to find details for expired insurance policies only returns 50 rows, but takes 2 minutes to finish.
  • Query is simple join of 2 tables: Response and Policy

5 where-clause ‘filters’ 1 ‘filter’

result set: 20,000 rows result set: 15,000 rows

RESP

POL

policy_id

result:

only 50 rows

  • Puzzle: How can query ever be quick, since the ‘super filter’ to reduce set to 50 rows is spread across two tables ?
impatient agent solution
Impatient Agent: Solution

The Trick:

1) Perform pre-processing that uses filters, but retrieves only the key (policy_id) that is used as the join column.

2) This field is retrieved via index read only--thus avoiding the table reads that account for the delays.

3) Use this result set as the starting point for the original join.

pre-processing driving “table”

original join

RESP

POL

RESP

POL

index reads only

table reads on small set only

daffy database links

6

Daffy Database Links
  • Version 7.3.4 database. Database links working normally. Database has been running straight for 3 weeks.
  • Suddenly, the links fail: ORA-12154: TNS:could not resolve service name
  • Investigation shows tnsnames.ora file unchanged. Link definition unchanged.
  • Listener is running Oracle 8.0.4. Adding entry to tnsnames.ora file in Oracle 8 directory causes the links to work again!
  • Tests with new links show that the tnsnames.ora file for 7.3.4 is being ignored!
  • Further, if Oracle 8 tnsnames file is removed, the one in Oracle 7 is used instead!
database links solution
Database Links: Solution
  • Clue: Listener had been restarted recently. But why should the Listener process (a server side function) affect finding the tns alias?
  • Metalink analyst suggested checking TNS_ADMIN; but how is that relevant? We are not creating a UNIX session.
  • Oracle Note 37808.1 clears-up the confusion:

For database links, TNS_ADMIN takes its value from

the value defined when the listener was started

  • That is, a client process (link) is influenced by a server-side process!
  • The listener had been started most recently with TNS_ADMIN set to 8.0.4, causing links to look in 8.0.4 tns file.
  • Unsetting TNS_ADMIN, then restarting listener caused all operations to run as expected.
all primary keys are equal but some are more equal than others

7

All Primary Keys are Equal(But some are more equal than others)
  • Background: Application tuning often requires DBA to ‘lure’ optimizer into using certain indexes. Hints not always successful or possible.
  • Optimizer ‘likes’ certain indexes, because they typically are faster than others. The favorite: an index on Primary Key.
  • Scenario: 20 gigabyte insurance billing system. Oracle 8.0.4.
  • Particular query joins to a table called POLICY. Joinuses the PK index (policy_ID).
  • Query speed-up requires that join use new index on (PK, other col).
  • Created new double index--but optimizer would not cooperate. It foolishly insists on using the PK index, even if hint used.
  • Question: How can we get the optimizer to obey the DBA?
all primary keys are equal solution
All Primary Keys are Equal: Solution
  • We need a way to ‘trick’ the optimizer into not using the PK index, but instead, use our ‘extra column’ index.
  • This is difficult, because the PK index is the #1 choice.
  • Trick: ‘Disguise’ the PK index as a unique index.
  • Oddity: If unique index already exists, addition of a matching PK will use the existing index.
  • In determining execution plan, optimizer will treat the new index as if it were a unique index, not a PK index.
  • So, drop the PK, build a matching unique index, then rebuild the PK.
  • Optimizer no longer stubbornly insists on using this ‘PK’ index, because it is not a true, pedigreed PK index--merely a unique index.
the sad case of the homeless archive logs

8

The Sad Case of the Homeless Archive Logs
  • Scenario: Large medical application; hundreds of connections into database. This is a critical, 24x7 server.
  • Users suddenly complain of database ‘hanging.’
  • Alert log shows database unable to write archive log.

ORA-00255: error archiving log 1 of thread 1, sequence # 200

ORA-00270: error creating archive log /demo/arch/1_200.dbf

ORA-19504: failed to create file "/demo/arch/1_200.dbf"

  • Investigation reveals disk crash on disk housing archive logs.
  • No .dbf files on that disk--only archive logs.
  • Problem: How can database operation be resumed, with minimal disruption to hundreds of users?
homeless archive logs solution
Homeless Archive Logs: Solution
  • DBA decided to dynamicallyredirect the archive logs, thereby avoiding need to shutdown database.
  • To change destination of archive logs:

alter system archive log start to ‘[new path]’

  • For example:

SVRMGRL> alter system archive log start to \'/demo/arch2/\'

  • Once the ‘backlog’ of archive logs is corrected, database automatically resumes normal operation.
  • Alert log now shows:

Fri Feb 11 09:00:47 2000

ARCH: Archival started

Archiver process freed from errors. No longer stopped.

the puzzling affair of the old archive logs

9

The Puzzling Affair of the Old Archive Logs
  • With database in Archive Mode, old archive logs must be purged.
  • On Unix, this is simple, using the find command (in cron).
  • For example, to purge archive logs greater than 6 days old:

find /logdir/arch -name “arch*” -mtime +6 -exec rm {} \;

  • Problem: How can we similarly purge the old archive logs on NT?
archive logs solution
Archive Logs: Solution
  • On NT, there is apparently no ‘find’ command, as in Unix.
  • The archive logs can be found indirectly, because the database ‘knows’ when/where they were written.
  • Logs listed in V$ARCHIVED_LOG
  • Example, to remove logs older than 2 days:

spool del_logs.sql

select \'del ‘ || name from v$archived_log where completion_time < (sysdate - 2);

spool off

host del_logs

  • Similar query could be used to copy archive logs to backup.
sql plus madness

10

SQL*Plus Madness
  • Scenario: Connection ‘hangs’ upon trying to connect in SQL*Plus.
  • Also get error message re DBMS_APPLICATION_INFO
  • DBA investigates. He finds:

1) Connect via Svrmgrl ok;

2) System user connects ok

  • Problem: How can DBA ‘fix’ SQL*Plus?
sql plus madness solution
SQL*Plus Madness: Solution
  • Package DBMS_APPLICATION_INFO is used to ‘register’ a running application, for tracking/analysis purposes.
  • Once an application is ‘registered’ it will show up in v$session in ‘module’ field.
  • SQL*Plus is one of few applications that try to ‘register.’ This explains why svrmgrl still works.
  • This all points to problem with the ‘register’ of SQL*Plus.
  • The set-up script to build necessary tables is called ‘pupbld.sql’
sql plus madness solution continued
SQL*Plus Madness: Solution(continued)
  • Action: Try to run pupbld.sql as SYSTEM but it hangs!
  • Investigation shows: product_profile table missing, but synonym still there!
  • Solution: drop public synonym for product_profile, then rebuild pupbld. All OK now.
contact information
Contact Information

Chris Lawson

[email protected]

http://www.dbspecialists.com

Database Specialists, Inc.

388 Market Street, Suite 400

San Francisco, CA 94111

ad