Finding & Eliminating Rogue Hex Characters in Text Fields. Martha Cox Cancer Outcomes Research Program CDHA / Dalhousie. The Problem. Chart abstraction data containing several comment fields (255 chars each) Some values with "random" line feeds.


Finding & Eliminating Rogue Hex Characters in Text Fields

Martha Cox

Cancer Outcomes Research Program

CDHA / Dalhousie


The Problem

Chart abstraction data containing several comment fields (255 chars each)

Some values with "random" line feeds


Patient ID  Comments
--------------------------------------------------------------------------------------------
013         Found hyperplastic polyp
--------------------------------------------------------------------------------------------
017         colonscopy performed in Bridewater - showed a large rectal tumor as well as
            multiple polyps throughout the colon.
--------------------------------------------------------------------------------------------
028         Pt did not have surgery.
            Biopsy from endoscopy came back as moderatley
            differentiated adneocarcinoma
--------------------------------------------------------------------------------------------
031         -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon
            and hepatic flexure.
            -No evidence of intraluminal tumor at this point.
--------------------------------------------------------------------------------------------
038         report not present.
--------------------------------------------------------------------------------------------
040         office sigmoidscopy done in April, 2003 and was found to be normal.
            A second
            sigmiodscopy was done in sx.
--------------------------------------------------------------------------------------------
056         colonscopy confirmed the presence of a low-lying carinoma of the rectum.
--------------------------------------------------------------------------------------------
084         lap attempted X 2 but resection could not be carried out.
            Most questions N
            A for laparatomy.
--------------------------------------------------------------------------------------------
155         had a hemicolectomy
--------------------------------------------------------------------------------------------
157         3
            4 tumor above reflection, 1
            4 was below reflection
--------------------------------------------------------------------------------------------


So I emailed my SAS buddies...


Lots of suggestions

  • compress? kcompress? The returns seem to fall between words, so compress would smash two words together.

  • translate or tranwrd? Should work, but these wouldn't take a hex value for me.

    Besides, which character(s) are the problem?
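For reference, SAS writes a hexadecimal character constant as quoted hex digits followed by x, for example '0D'x for a carriage return. The sketch below shows such a constant passed to translate. The data set names work.demo and work.demo_clean are made up for illustration, and behaviour in the original VMS environment may have differed.

```sas
/* Hypothetical demo data: one comment with an embedded CR/LF pair */
data work.demo;
   length COMMENT $40;
   COMMENT = 'first line' || '0D0A'x || 'second line';
run;

data work.demo_clean;
   set work.demo;
   /* translate(source, to, from): each CR and each LF becomes one blank */
   COMMENT = translate(COMMENT, '  ', '0D0A'x);
run;
```

Because translate works character by character, a CR/LF pair becomes two blanks here rather than one.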


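How to Find the Bad Word

data charlist;
   set shrug.sample1 (where=(PATIENT in (28)));   /* one problem record */
   length single singlhex $1;
   /* Assumed: the listing shows singlhex in hex, so display it with $HEX2. */
   format singlhex $hex2.;
   loopx = length(trim(COMMENT));                 /* position of last non-blank character */
   do i = 1 to loopx;
      single   = substr(COMMENT, i, 1);           /* the character itself */
      singlhex = single;                          /* same character, shown as its hex code */
      output;                                     /* one observation per character */
   end;
   keep single singlhex;
run;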


Patient 28's comment, one char at a time

Obs  single  singlhex
 20    g        67
 21    e        65
 22    r        72
 23    y        79
 24    .        2E
 25             20
 26             0D
 27             0A
 28             0D
 29             0A
 30    B        42
 31    i        69
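Once the offending byte pair is known, the same hex constant can be used to list every record that contains it, rather than inspecting one patient at a time. This is only a sketch: work.suspects and crlf_pos are hypothetical names, while shrug.sample1, PATIENT and COMMENT are taken from the program above.

```sas
data work.suspects;
   set shrug.sample1;
   /* index returns the position of the first CR/LF pair, or 0 if none */
   crlf_pos = index(COMMENT, '0D0A'x);
   if crlf_pos > 0;            /* keep only the affected records */
   keep PATIENT crlf_pos;
run;
```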


Repair Program

data shrug.sample2;
   set shrug.sample1;
   badword  = '0D0A'x;    /* carriage return + line feed, as found above */
   goodword = ' ';        /* replace each pair with a single blank */
   COMMENT  = tranwrd(COMMENT, badword, goodword);
   drop badword goodword;
run;


Results

Patient ID  Comments
--------------------------------------------------------------------------------------------
013         Found hyperplastic polyp
--------------------------------------------------------------------------------------------
017         colonscopy performed in Bridewater - showed a large rectal tumor as well as
            multiple polyps throughout the colon.
--------------------------------------------------------------------------------------------
028         Pt did not have surgery. Biopsy from endoscopy came back as moderatley
            differentiated adneocarcinoma
--------------------------------------------------------------------------------------------
031         -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon
            and hepatic flexure. -No evidence of intraluminal tumor at this point.
--------------------------------------------------------------------------------------------
038         report not present.
--------------------------------------------------------------------------------------------
040         office sigmoidscopy done in April, 2003 and was found to be normal. A second
            sigmiodscopy was done in sx.
--------------------------------------------------------------------------------------------
056         colonscopy confirmed the presence of a low-lying carinoma of the rectum.
--------------------------------------------------------------------------------------------
084         lap attempted X 2 but resection could not be carried out. Most questions N
            A for laparatomy.
--------------------------------------------------------------------------------------------
155         had a hemicolectomy
--------------------------------------------------------------------------------------------
157         3
            4 tumor above reflection, 1
            4 was below reflection
--------------------------------------------------------------------------------------------


Hmm...

  • Noticed that the remaining breaks seemed to be occurring where one might have used a slash (“/”).

  • Working in a VMS batch environment; no Display Manager.

  • Looking at the data via PROC REPORT with “flow” for the comments column.

So, is this a data problem or a reporting problem?


after much digging through SAS manuals...


The Answer!

Split character in PROC REPORT

  • not just for column headers

  • also used to split long text values in the body of the report

  • default character is slash
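One way to stop the report from breaking at slashes is to give PROC REPORT a split character that never occurs in the data. The slides do not show the exact change that was made, so the following is only a sketch: the tilde, the column width of 80 and the column labels are assumptions, while shrug.sample1, PATIENT and COMMENT come from the earlier programs.

```sas
/* SPLIT= replaces the default "/" so that values such as N/A stay intact */
proc report data=shrug.sample1 nowd split='~';
   column PATIENT COMMENT;
   define PATIENT / display 'Patient ID';
   /* FLOW wraps long text within the column width */
   define COMMENT / display flow width=80 'Comments';
run;
```

With a split character that does not occur in the comments, FLOW wraps the text at the column width instead of at each slash.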


Final Results

Patient ID  Comments
--------------------------------------------------------------------------------------------
013         Found hyperplastic polyp
--------------------------------------------------------------------------------------------
017         colonscopy performed in Bridewater - showed a large rectal tumor as well as
            multiple polyps throughout the colon.
--------------------------------------------------------------------------------------------
028         Pt did not have surgery. Biopsy from endoscopy came back as moderatley
            differentiated adneocarcinoma
--------------------------------------------------------------------------------------------
031         -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon
            and hepatic flexure. -No evidence of intraluminal tumor at this point.
--------------------------------------------------------------------------------------------
038         report not present.
--------------------------------------------------------------------------------------------
040         office sigmoidscopy done in April, 2003 and was found to be normal. A second
            sigmiodscopy was done in sx.
--------------------------------------------------------------------------------------------
056         colonscopy confirmed the presence of a low-lying carinoma of the rectum.
--------------------------------------------------------------------------------------------
084         lap attempted X 2 but resection could not be carried out. Most questions N/A
            for laparatomy.
--------------------------------------------------------------------------------------------
155         had a hemicolectomy
--------------------------------------------------------------------------------------------
157         3/4 tumor above reflection, 1/4 was below reflection
--------------------------------------------------------------------------------------------


Any questions?

