Optimal Discretization of Random Variables Using Normal Distribution Mapping

Supplementary Text 2

If the relationship between two random variables X and Y is approximately as illustrated in the following figures, how can we discretize their sample values?

[Fig. A: N(μ1, σ1), centered at μ1, with partitioning points t1,1, t1,2, t1,3, t1,4. Fig. B: N(μ2, σ2), centered at μ2, with partitioning points t2,1, t2,2, t2,3, t2,4.]

To describe the relationship between X and Y uniformly, we can map both variables to the standard normal distribution. The relationship then becomes the one illustrated in the figures below.

[Fig. C: N(0,1), centered at 0, with partitioning points t1, t2, t3, t4. Fig. D: N(0,1), centered at 0, with the same partitioning points t1, t2, t3, t4.]
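The mapping to a common N(0,1) scale can be sketched as follows. Equation (14) is not reproduced in this supplement, so this sketch assumes the transformation is the usual standardization z = (x − μ) / σ with μ and σ estimated from the sample; the function name `standardize` and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Map samples to zero mean and unit variance (z-scores).

    Assumption: this stands in for Equation (14) of the paper,
    which is not shown in this text.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    if sigma == 0:
        raise ValueError("cannot standardize a constant sample")
    return [(v - mu) / sigma for v in values]

# Hypothetical samples of X and Y on very different scales.
x = [2.0, 4.0, 6.0, 8.0]
y = [110.0, 90.0, 105.0, 95.0]
zx = standardize(x)
zy = standardize(y)
```

After this step both samples live on the same scale, which is what makes a single set of partitioning points meaningful for both variables.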

The data transformation from Figure A to Figure C is implemented with Equation (14) in the paper, as is the transformation from Figure B to Figure D. Our goal is then to find the best common partitioning points t1, t2, …, tk. The purpose of Equation (15) in the paper is to determine these common partitioning points for both X and Y while keeping each variable as evenly partitioned as possible. The structures of the variables shown in Figures B and D reveal the essence of the relationship between two random variables of the same kind. This matters more as the number of random variables grows: a common partition prevents the cut points from being chosen arbitrarily just to fit the data, some of which is merely noise.
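As an illustration of common partitioning points, the sketch below places k cut points at the equiprobable quantiles of N(0,1) and applies them to standardized values. This is a simple stand-in, not Equation (15) itself, which is not reproduced here; the names `common_cut_points` and `discretize` and the sample z-values are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

def common_cut_points(k):
    """Return k cut points that split N(0,1) into k + 1 equiprobable bins.

    Assumption: equiprobable quantiles are used here only as one way
    to get an "evenly partitioned" variable; the paper's Equation (15)
    may select the points differently.
    """
    std_normal = NormalDist(0.0, 1.0)
    return [std_normal.inv_cdf(i / (k + 1)) for i in range(1, k + 1)]

def discretize(z_values, cuts):
    """Assign each standardized value the index of the bin it falls in."""
    return [bisect_right(cuts, z) for z in z_values]

cuts = common_cut_points(4)  # t1..t4, shared by every variable
levels_x = discretize([-1.5, -0.3, 0.1, 0.4, 2.0], cuts)
```

Because the same `cuts` list is applied to every standardized variable, a given level means the same thing for X as for Y.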

Example 3 The gene expression data are partially listed below. Suppose that Genes #1 and #2 have been assigned to the same group G1, while Genes #3, #4, and #5 are in the same group G2.

For the data in the table above, we define 15 vectors as follows.

To compute the mutual information between the two groups G1 and G2 under the low pressure treatment, described in Equation (17) in the paper, we can use the following detailed equation instead of (17):

To compute the mutual information between the two groups G1 and G2 under the high pressure treatment, described in Equation (18) in the paper, we can use the following detailed equation instead of (18):

To compute the mutual information between the two groups G1 and G2 under both the low pressure and the high pressure treatments, described in Equation (19) in the paper, we can use the following detailed equation instead of (19):

To compute the mutual information between the low pressure treatment and the high pressure treatment within the group G1, described in Equation (20) in the paper, we can use the following detailed equation instead of (20):

To compute the mutual information between the low pressure treatment and the high pressure treatment within the group G2, described in Equation (20) in the paper, we can use the following detailed equation instead of (20):
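The detailed equations did not survive in this text. As a rough guide to the quantity being computed, the sketch below is the generic plug-in estimate of mutual information between two equally long sequences of discretized levels. It is not a reconstruction of Equations (17) to (20); the function name `mutual_information` and the example sequences are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(a, b):
    """Plug-in estimate of I(A; B) in bits from paired discrete samples.

    Assumption: a and b are sequences of discretized levels of equal
    length; how the paper's 15 vectors are combined per group is not
    shown here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    count_ab = Counter(zip(a, b))
    # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )

identical = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Two identical binary sequences share one full bit of information, while the second pair, whose levels co-occur uniformly, shares none.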
