1 / 10

Redoing ratio analysis with clustering techniques

Applying machine learning techniques to identify financial reporting peer firms through clustering analysis of financial ratios. Enhances fraud detection models.

ahuffman
Download Presentation

Redoing ratio analysis with clustering techniques

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Redoing ratio analysis with clustering techniques Kexing Ding Xuan Peng Miklos Vasarhelyi Yunsen Wang

  2. Peerselectionwithclustering • Weapplymachinelearningtechniquesonfinancialratiostoidentifyfinancialreportingpeerfirm. • Financialreportingpeerfirms:firmsthathavesimilarfinancialreportingqualities. • Observation:SICcodesarecommonlyused • Obsolete • Infrequentupdate • Focusonproductionprocess

  3. Financialreportingqualitymodels • Beneish(1997,1999): • Financialreportingqualityisassociatedwitheightratios • Eitherthe accounting distortion or the preconditions that encourage managers to conduct misreporting. • Widelyusedinacademicandinpracticeeversince • Sales in receivables (Receivables/Sales), • Gross Margin (Sales-Costs of Goods Sold/Sales), • Asset Quality (1 - (Current Assets + PPE)/Total Assets), • Sales Growth (Salest/Salest-1), • Depreciation Rate (Depreciation/NetPPE), • SGA Rate (Sales, general, and administrative expenses/Sales), • Leverage (Total Debt/Total Assets), • Accruals (Total Accruals/ Total Assets).    

  4. Clustering method • Clustering analysis is one of the data mining methodologies that groups a set of objects in such a way that objects in a same cluster is more similar to each other than to those in the other clusters. • The K-medians algorithm operates on a set (X) of n points. It chooses k centers { from X and form k clusters { that the sum of the distances from each  to the center of its clusters ci () is minimized. • An observation with eight financial ratios can be viewed as a point () in an eight-dimensional space. • Everyyear,theclusteringprocedureassignseachpointinoneofthe300clusters.

  5. Results(1) • Clusteringsuccessfullyfindsfirmswithsimilarfinancialratios. • Lowwithin-groupdeviation • Highinter-groupdeviations

  6. Results(2) • Enhanceaccountingfrauddetection model • Dechow(2011)F-scoremodel: • F-score = The predicted probability / the unconditional expectation of accounting fraud • The higher the F-score, the more suspicious • Revised model:

  7. RankthefirmsbasedonF-score: Incorporating firms’ deviation from financial reporting peer firms enhances the ability of F-score to detect fraud firms. Model 1: Dechow (2011) Model Model 2a: Revised Model

  8. 1.WerankthefirmsbasedonF-score: The ability of F-score derived from the SIC 4-digit (NAIC 4-digit) classification to detect fraud is similar to the original Dechow et al. (2011) model. . Model 2a: Revised Model with clustering Model 2b and 2c are Revised models with SIC 3-digit and NAIC 4-digit

  9. Type I error is calculated as the percentage of non-fraud observations that are incorrectly predicted as fraud. • Type II error is calculated as the percentage of fraud observations that are incorrectly predicted as non-fraud by the model. • Type1errorreducesfrom40.8%to34.7% • Type2 error reduces from32.4% to28.8%.

  10. Thankyou!

More Related