Finding & Eliminating Rogue Hex Characters in Text Fields

Martha Cox

Cancer Outcomes Research Program

CDHA / Dalhousie

The Problem

Chart abstraction data containing several comment fields (255 chars each)

Some values with "random" line feeds

Patient ID Comments

--------------------------------------------------------------------------------------------

013 Found hyperplastic polyp

--------------------------------------------------------------------------------------------

017 colonscopy performed in Bridewater - showed a large rectal tumor as well as

multiple polyps throughout the colon.

--------------------------------------------------------------------------------------------

028 Pt did not have surgery.

Biopsy from endoscopy came back as moderatley

differentiated adneocarcinoma

--------------------------------------------------------------------------------------------

031 -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon

and hepatic flexure.

-No evidence of intraluminal tumor at this point.

--------------------------------------------------------------------------------------------

038 report not present.

--------------------------------------------------------------------------------------------

040 office sigmoidscopy done in April, 2003 and was found to be normal.

A second

sigmiodscopy was done in sx.

--------------------------------------------------------------------------------------------

056 colonscopy confirmed the presence of a low-lying carinoma of the rectum.

--------------------------------------------------------------------------------------------

084 lap attempted X 2 but resection could not be carried out.

Most questions N

A for laparatomy.

--------------------------------------------------------------------------------------------

155 had a hemicolectomy

--------------------------------------------------------------------------------------------

157 3

4 tumor above reflection, 1

4 was below reflection

--------------------------------------------------------------------------------------------

Lots of suggestions
  • compress? kcompress? The returns seem to fall between words, so compress would smash two words together.
  • translate or tranwrd? Should work, but these wouldn't take a hex value for me.
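
On the hex-value point: SAS accepts hexadecimal character constants such as '0D0A'x anywhere a quoted string is allowed, so they can be passed to these functions. The sketch below is only an illustration. The comment value is made up, and it assumes the culprits are carriage return ('0D'x) and line feed ('0A'x), which has not been established at this point in the slides.

```sas
data _null_;
  length comment $80;
  /* made-up sample value with an embedded CR/LF pair */
  comment = 'no surgery.' || '0D0A'x || 'Biopsy pending';

  /* translate: map CR and LF each to a blank, so the words stay apart */
  spaced = translate(comment, '  ', '0D0A'x);

  /* compress: delete CR and LF outright, which joins the two words */
  squeezed = compress(comment, '0D0A'x);

  put spaced=;
  put squeezed=;
run;
```

Note the argument order of translate: the replacement characters come second and the characters to be replaced come third.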

Besides, which character(s) is the problem?

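How to find the Bad Word

data charlist;
  set shrug.sample1 (where=(PATIENT in (28)));
  length single singlhex $1;
  /* $HEX2. shows the same byte as two hex digits; without it,   */
  /* singlhex would print as the raw character, same as single   */
  format singlhex $hex2.;
  loopx = length(trim(COMMENT));
  /* write one observation per character of the comment */
  do i = 1 to loopx;
    single = substr(COMMENT, i, 1);
    singlhex = single;
    output;
  end;
  keep single singlhex;
run;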
Patient 28's comment, one char at a time

Obs   single   singlhex
 20     g        67
 21     e        65
 22     r        72
 23     y        79
 24     .        2E
 25              20
 26              0D
 27              0A
 28              0D
 29              0A
 30     B        42
 31     i        69

Obs 26-29 are two carriage return ('0D'x) + line feed ('0A'x) pairs, sitting between "surgery." and "Biopsy".
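
The character-by-character listing works for one patient at a time. To see which records contain these bytes at all, something like the following could be used. It is a sketch only: it assumes the shrug libref and the PATIENT and COMMENT variables from the program above, and the data set name suspects and the variables pos and hexchar are made up for illustration.

```sas
data suspects;
  set shrug.sample1;
  /* position of the first CR or LF in the comment, 0 if none */
  pos = indexc(COMMENT, '0D0A'x);
  if pos > 0;
  /* show the offending byte as two hex digits */
  hexchar = put(substr(COMMENT, pos, 1), $hex2.);
  keep PATIENT pos hexchar;
run;

proc print data=suspects;
run;
```

Patients that still show odd breaks in the report but do not appear in this listing would point to a cause other than embedded CR/LF characters.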
Repair Program

data shrug.sample2;
  set shrug.sample1;
  badword  = '0D0A'x;   /* carriage return + line feed pair found above */
  goodword = ' ';       /* each pair becomes a single blank */
  COMMENT = tranwrd(COMMENT, badword, goodword);
  drop badword goodword;
run;
Results

Patient ID Comments

--------------------------------------------------------------------------------------------

013 Found hyperplastic polyp

--------------------------------------------------------------------------------------------

017 colonscopy performed in Bridewater - showed a large rectal tumor as well as

multiple polyps throughout the colon.

--------------------------------------------------------------------------------------------

028 Pt did not have surgery. Biopsy from endoscopy came back as moderatley

differentiated adneocarcinoma

--------------------------------------------------------------------------------------------

031 -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon

and hepatic flexure. -No evidence of intraluminal tumor at this point.

--------------------------------------------------------------------------------------------

038 report not present.

--------------------------------------------------------------------------------------------

040 office sigmoidscopy done in April, 2003 and was found to be normal. A second

sigmiodscopy was done in sx.

--------------------------------------------------------------------------------------------

056 colonscopy confirmed the presence of a low-lying carinoma of the rectum.

--------------------------------------------------------------------------------------------

084 lap attempted X 2 but resection could not be carried out. Most questions N

A for laparatomy.

--------------------------------------------------------------------------------------------

155 had a hemicolectomy

--------------------------------------------------------------------------------------------

157 3

4 tumor above reflection, 1

4 was below reflection

--------------------------------------------------------------------------------------------

Hmm...
  • Noticed that the breaks seemed to be occurring where one might have used a slash (“/”).
  • Working in a VMS batch environment; no Display Manager.
  • Looking at the data via PROC REPORT with “flow” for the comments column.

So, is this a data problem or a reporting problem?

The Answer!

Split character in PROC REPORT

  • not just for column headers
  • also used to split long text values in the body of the report
  • default character is slash
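
One way to stop the slash from splitting the text is to set the SPLIT= option on the PROC REPORT statement to a character that does not appear in the data. The sketch below is not the report from the slides: the column labels and width are guesses, and '~' is assumed never to occur in the comments.

```sas
/* split='~' replaces the default split character '/' */
proc report data=shrug.sample1 nowd split='~';
  column PATIENT COMMENT;
  define PATIENT / display 'Patient ID';
  /* flow wraps long comment values within the column width */
  define COMMENT / display 'Comments' width=80 flow;
run;
```

With the split character changed, values such as "N/A" and "3/4" should print intact, and flow wraps the text only at the column width.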
Final Results

Patient ID Comments

--------------------------------------------------------------------------------------------

013 Found hyperplastic polyp

--------------------------------------------------------------------------------------------

017 colonscopy performed in Bridewater - showed a large rectal tumor as well as

multiple polyps throughout the colon.

--------------------------------------------------------------------------------------------

028 Pt did not have surgery. Biopsy from endoscopy came back as moderatley

differentiated adneocarcinoma

--------------------------------------------------------------------------------------------

031 -Pt. had moderately severe sigmoid diverticulosis and agulated sigmoid colon

and hepatic flexure. -No evidence of intraluminal tumor at this point.

--------------------------------------------------------------------------------------------

038 report not present.

--------------------------------------------------------------------------------------------

040 office sigmoidscopy done in April, 2003 and was found to be normal. A second

sigmiodscopy was done in sx.

--------------------------------------------------------------------------------------------

056 colonscopy confirmed the presence of a low-lying carinoma of the rectum.

--------------------------------------------------------------------------------------------

084 lap attempted X 2 but resection could not be carried out. Most questions N/A

for laparatomy.

--------------------------------------------------------------------------------------------

155 had a hemicolectomy

--------------------------------------------------------------------------------------------

157 3/4 tumor above reflection, 1/4 was below reflection

--------------------------------------------------------------------------------------------
