Computer Organization and Assembly Language

Teacher : cyy

Presenter : B98902071 康秩群

Outline of this slide
  • The 0-bits counting problem
    • Naïve algorithm
    • Table lookup approach
    • Counting 1’s and subtracting from 16
      • Eliminating algorithm
      • Parallel counting algorithm
    • Other optimization techniques
The circle-counting problem
  • Input: An array of 16-bit integers (the array holds at most 32 of them).
  • Output: The total number of 0-bits in the array.
Naïve algorithm

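b = 0;                        // number of 0-bits found so far
do {
    d = a[--c];               // c starts as the array size
    r2 = 1;                   // one-bit mask, assumed to be 16 bits wide
    do {
        if ((d & r2) == 0)    // parentheses needed: == binds tighter than &
            b++;
        r2 <<= 1;
    } while (r2 != 0);        // the 16-bit mask becomes 0 after 16 shifts
} while (c > 0);
return b;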
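A minimal runnable sketch of the naïve loop above, written as plain C rather than register-level pseudocode. The function name `count_zero_bits_naive` and its signature are illustrative assumptions, not part of the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the 0-bits in an array of 16-bit integers, one bit at a time.
 * Hypothetical helper: the name and signature are not from the slides. */
unsigned count_zero_bits_naive(const uint16_t *a, size_t len)
{
    unsigned zeros = 0;
    for (size_t i = 0; i < len; i++) {
        /* mask visits bit 0 through bit 15 */
        for (uint32_t mask = 1; mask <= 0x8000u; mask <<= 1) {
            if ((a[i] & mask) == 0)
                zeros++;
        }
    }
    return zeros;
}
```

For the three values 0x0000, 0xFFFF and 0x6D9E the function returns 16 + 0 + 6 = 22.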

Time complexity

The above algorithm runs in

[number of 0-bits] * 5 + [number of 1-bits] * 4 + [other overhead]

= O(5n) = O(n) clocks, where n is the total number of bits in the array.

Performance
  • 4032 clocks
    • Rank #39
Table lookup approach

0110    1101    1001    1110
group1  group2  group3  group4

0000 : 4 0’s
0001 : 3 0’s
0010 : 3 0’s
0011 : 2 0’s
...
1111 : 0 0’s

Constructing the table

// C0[v] = number of 0-bits in the 4-bit value v
int C0[16] = {4, 3, 3, 2,
              3, 2, 2, 1,
              3, 2, 2, 1,
              2, 1, 1, 0};
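The table can also be filled by a loop instead of being typed out. The sketch below is one way to do that; `build_zero_table` is a hypothetical name, and the slides do not say how their table was actually constructed.

```c
#include <stdint.h>

/* Fill table[v] with the number of 0-bits in the 4-bit value v (0..15).
 * Hypothetical helper, not taken from the slides. */
void build_zero_table(uint8_t table[16])
{
    for (unsigned v = 0; v < 16; v++) {
        unsigned zeros = 0;
        for (unsigned bit = 0; bit < 4; bit++) {
            if (((v >> bit) & 1u) == 0)
                zeros++;
        }
        table[v] = (uint8_t)zeros;
    }
}
```

The result matches the literal table: entry 0 is 4, entry 7 is 1, entry 15 is 0.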

Querying the table

do {
    d = a[--c];
    b += C0[d & 0xF];    // bits 0-3
    d >>= 4;
    d &= 0xFFF;          // keep only the 12 remaining bits
    b += C0[d & 0xF];    // bits 4-7
    d >>= 4;
    b += C0[d & 0xF];    // bits 8-11
    d >>= 4;
    b += C0[d & 0xF];    // bits 12-15
} while (c > 0);
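The lookup loop can be written as a self-contained C function, as sketched here. The table values are the ones given on the slide; the function name `count_zero_bits_table` and the use of `uint16_t` are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 0-bits in each 4-bit value, as on the slide. */
static const uint8_t C0[16] = {4, 3, 3, 2, 3, 2, 2, 1,
                               3, 2, 2, 1, 2, 1, 1, 0};

/* Count 0-bits with four table lookups per 16-bit integer.
 * Hypothetical helper: the name and signature are not from the slides. */
unsigned count_zero_bits_table(const uint16_t *a, size_t len)
{
    unsigned zeros = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t d = a[i];
        zeros += C0[d & 0xF];          /* bits 0-3   */
        zeros += C0[(d >> 4) & 0xF];   /* bits 4-7   */
        zeros += C0[(d >> 8) & 0xF];   /* bits 8-11  */
        zeros += C0[(d >> 12) & 0xF];  /* bits 12-15 */
    }
    return zeros;
}
```

For 0x6D9E (0110 1101 1001 1110) the four lookups give 2 + 1 + 2 + 1 = 6 zeros.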

Time complexity

The above algorithm runs in

[number of integers] * 24 + 42 (constructing the table)

= O( (24/16)n ) = O(1.5n) = O(n)

Performance
  • 1578 clocks
    • Rank #18
Counting 1’s and subtracting from 16
  • You can construct a larger table such as C0[64] and split each integer into 6-6-4 bit groups.
  • The run time is still no less than ¾ of the above algorithm (>= 1200 clocks).
  • How about another viewpoint: count the 1’s and subtract that count from 16?
  • There are many interesting algorithms!
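The 6-6-4 split and the "count 1's, subtract from 16" idea can be combined in one sketch. Note the deliberate deviation: the table here stores counts of 1's, not 0's, because a 4-bit top group looked up in a 64-entry zeros table would also count two padding zeros. The names `ones6`, `build_ones6` and `count_zero_bits_664` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ones6[v] = number of 1-bits in the 6-bit value v; filled by build_ones6(). */
static uint8_t ones6[64];

/* Must be called once before count_zero_bits_664(). */
void build_ones6(void)
{
    ones6[0] = 0;
    for (unsigned v = 1; v < 64; v++)
        ones6[v] = (uint8_t)(ones6[v >> 1] + (v & 1u));
}

/* Three lookups per integer: bits 0-5, bits 6-11, bits 12-15. */
unsigned count_zero_bits_664(const uint16_t *a, size_t len)
{
    unsigned zeros = 16u * (unsigned)len;   /* assume all zeros, then subtract 1's */
    for (size_t i = 0; i < len; i++) {
        uint16_t d = a[i];
        zeros -= ones6[d & 0x3F];
        zeros -= ones6[(d >> 6) & 0x3F];
        zeros -= ones6[d >> 12];            /* only 4 bits remain, index is 0..15 */
    }
    return zeros;
}
```

Three lookups instead of four is where the ¾ bound in the bullet above comes from.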
Eliminating algorithm

while (n) {
    count++;        // one more 1 found
    n &= n - 1;     // clear the lowest 1 of n
}

  • Two cases: (1) n ends in 1: *************1  (2) n ends in 10...0: *****10...0000
Case 1
  • n       = *************1
  • n-1     = *************0

-----------------------

  • n&(n-1) = *************0
  • The lowest 1 was eliminated.
Case 2
  • n       = *****10...0000
  • n-1     = *****01...1111

-----------------------

  • n&(n-1) = *****00...0000
  • The lowest 1 was eliminated.
Eliminate a 1 each round
  • When n reaches zero, every 1 has been counted and the loop ends!
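The two cases can be checked on concrete values. The sketch below isolates the single step `n & (n - 1)`; the function name `clear_lowest_one` is a hypothetical label for it.

```c
/* Clear the lowest 1-bit of n. Returns 0 when n is already 0.
 * Hypothetical helper name; the step itself is the one from the slides. */
unsigned clear_lowest_one(unsigned n)
{
    return n & (n - 1);
}
```

Case 1: 101011 (0x2B) becomes 101010 (0x2A). Case 2: 1101000 (0x68) becomes 1100000 (0x60). In both cases exactly one 1 disappears and the higher bits are untouched.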
Implementation

b = c << 4;            // c * 16: start as if every bit were 0
do {
    d = a[--c];
    while (d) {
        b--;           // each 1 found removes one from the count
        d &= d - 1;    // clear the lowest 1
    }
} while (c > 0);
return b;
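A runnable C version of the same loop, as a sketch. The name `count_zero_bits_eliminating` and its signature are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Start from 16 zeros per integer and subtract one for every 1-bit found.
 * Hypothetical helper: the name and signature are not from the slides. */
unsigned count_zero_bits_eliminating(const uint16_t *a, size_t len)
{
    unsigned zeros = 16u * (unsigned)len;
    for (size_t i = 0; i < len; i++) {
        unsigned d = a[i];
        while (d) {
            zeros--;       /* one 1-bit accounted for */
            d &= d - 1;    /* clear the lowest 1-bit  */
        }
    }
    return zeros;
}
```

The inner loop runs once per 1-bit, so an all-zero array costs almost nothing while an all-one array costs 16 rounds per integer.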

Time complexity

The above algorithm runs in

[number of 1-bits] * 5 + [size of array] * 4 + 4

= O( [5 + (4/16)]n ) = O(5.25n) = O(n) in the worst case, when every bit is 1

Performance
  • 2100 clocks
    • Rank #27
  • Slower? It depends on the number of 1’s.
  • It was faster than the table lookup approach before the rejudge.
    • Evidently, the rejudge data contained more 1-bits.
  • But the code is short, which makes it handy for other things.
Parallel counting algorithm
  • Similar to the other approaches

00 (0): 0 ones → 00 - 0 = 00 (0)

01 (1): 1 one → 01 - 0 = 01 (1)

10 (2): 1 one → 10 - 1 = 01 (1)

11 (3): 2 ones → 11 - 1 = 10 (2)

  • Number of 1’s in a 2-bit field = [the original two bits] - [the left bit]
  • Then add neighbouring fields together iteratively
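The two-bit rule can be applied to all eight 2-bit fields of a 16-bit value at once. The sketch below shows only that first step; `pair_counts16` is a hypothetical name.

```c
#include <stdint.h>

/* Replace every 2-bit field of x by the number of 1-bits it contained:
 * [the original two bits] - [the left bit], done for all eight fields at once.
 * Hypothetical helper name. */
uint16_t pair_counts16(uint16_t x)
{
    /* (x >> 1) & 0x5555 moves each field's left bit into its right position */
    return (uint16_t)(x - ((x >> 1) & 0x5555u));
}
```

For the four 2-bit inputs 00, 01, 10, 11 this yields 0, 1, 1, 2, matching the table above. Borrows never cross a field boundary because each field subtracts at most its own left bit.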
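Parallel counting algorithm

b = c << 4;                                  // 16 bits per integer, then subtract the 1's
do {
    x = a[--c];                              // load the next integer, as in the earlier loops
    x = x - ((x >> 1) & 0x5555);             // 1's in each 2-bit field
    x = (x & 0x3333) + ((x >> 2) & 0x3333);  // 1's in each 4-bit field (0..4)
    x = x + (x >> 4);                        // nibbles 0 and 2 now hold the per-byte sums
    b -= x & 0xF;                            // 1's in the low byte
    b -= (x >> 8) & 0xF;                     // 1's in the high byte
} while (c > 0);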
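The same steps as a self-contained C sketch. The names `ones16` and `count_zero_bits_parallel` are hypothetical; the masks and shifts are the ones on the slide.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 1-bits in a 16-bit value, using the slide's mask-and-add steps. */
unsigned ones16(uint16_t v)
{
    unsigned x = v;
    x = x - ((x >> 1) & 0x5555u);              /* 2-bit fields: 0..2 */
    x = (x & 0x3333u) + ((x >> 2) & 0x3333u);  /* 4-bit fields: 0..4 */
    x = x + (x >> 4);                          /* nibbles 0 and 2: 0..8 */
    return (x & 0xFu) + ((x >> 8) & 0xFu);     /* low byte + high byte */
}

/* 16 zeros per integer minus the 1-bits actually present. */
unsigned count_zero_bits_parallel(const uint16_t *a, size_t len)
{
    unsigned zeros = 16u * (unsigned)len;
    for (size_t i = 0; i < len; i++)
        zeros -= ones16(a[i]);
    return zeros;
}
```

Unlike the eliminating algorithm, the cost per integer is fixed and does not depend on how many 1-bits the data contains.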

Time complexity

The above algorithm runs in

[number of integers] * 18 + 9

= O( (18/16)n ) = O(1.125n) = O(n)

Performance
  • 1224 clocks
    • Rank #10
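Processing 3 integers
  • Process the integers in groups of three (same idea as 阿蹦's)
    • 4 bits can represent 0 to 15, but a 4-bit group contains at most four 1's
    • So once the number of 1's in each 4-bit group is known, the counts from three integers fit in the same 4 bits (4 * 3 = 12 < 15)
    • The remaining steps are then done once for all three, saving the follow-up computation for two of them
      • The code is a bit long, so it is not attached; contact me if you are interested

1111 → 0100

1111 → 0100

1111 → 0100

-----------------

       1100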
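Since the presenter's code is not included, the following is only a sketch of how the packing idea could look in C, not a reconstruction of the original. All names (`nibble_counts`, `count_ones_3`, `count_zero_bits_by3`) are hypothetical, and a short tail is padded with zeros, which is this sketch's choice.

```c
#include <stddef.h>
#include <stdint.h>

/* First two parallel steps only: each 4-bit field ends up holding 0..4. */
static unsigned nibble_counts(uint16_t v)
{
    unsigned x = v;
    x = x - ((x >> 1) & 0x5555u);
    return (x & 0x3333u) + ((x >> 2) & 0x3333u);
}

/* 1-bits in three integers; the final reduction is done only once. */
unsigned count_ones_3(uint16_t p, uint16_t q, uint16_t r)
{
    /* each nibble of s is at most 4 * 3 = 12, so nothing overflows */
    unsigned s = nibble_counts(p) + nibble_counts(q) + nibble_counts(r);
    s = (s & 0x0F0Fu) + ((s >> 4) & 0x0F0Fu);  /* per-byte sums, at most 24 */
    return (s & 0xFFu) + (s >> 8);             /* total, at most 48 */
}

/* Zero count over an array, three integers at a time.
 * A missing integer in the last group is treated as 0 (no 1-bits). */
unsigned count_zero_bits_by3(const uint16_t *a, size_t len)
{
    unsigned zeros = 16u * (unsigned)len;
    for (size_t i = 0; i < len; i += 3) {
        uint16_t q = (i + 1 < len) ? a[i + 1] : 0;
        uint16_t r = (i + 2 < len) ? a[i + 2] : 0;
        zeros -= count_ones_3(a[i], q, r);
    }
    return zeros;
}
```

With three inputs of 0xFFFF the packed nibbles are 1100 each, as in the illustration above, and the total is 48.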

Time complexity

The above algorithm runs in

ceil([number of integers] / 3) * 45 + 10

= O( (15/16)n ) = O(0.9375n) = O(n)

Performance
  • 1090 clocks
    • Rank #7
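Other optimization techniques
  • Loop unrolling
    • Given the code length, four groups (12 integers) can be unrolled
    • The tail, where fewer than three groups remain, must break out of the loop; remove as many checks that do not affect the result as possible
      • As a side effect, this revealed that the two test cases contain 16 and 32 integers respectively
  • Read the input and compute directly in main (for part #3)
    • Three groups (9 integers) can also be unrolled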
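Loop unrolling in general can be sketched as follows. This is a generic 4-way unroll with a separate tail loop, not the presenter's 12-integer scheme, and the names `ones16_u` and `count_zero_bits_unrolled` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 1-bits in a 16-bit value (parallel counting steps). */
static unsigned ones16_u(uint16_t v)
{
    unsigned x = v;
    x = x - ((x >> 1) & 0x5555u);
    x = (x & 0x3333u) + ((x >> 2) & 0x3333u);
    x = x + (x >> 4);
    return (x & 0xFu) + ((x >> 8) & 0xFu);
}

/* Four integers per iteration, so the loop test and branch run a quarter
 * as often; the leftover 0..3 integers are handled by the tail loop. */
unsigned count_zero_bits_unrolled(const uint16_t *a, size_t len)
{
    unsigned zeros = 16u * (unsigned)len;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        zeros -= ones16_u(a[i]);
        zeros -= ones16_u(a[i + 1]);
        zeros -= ones16_u(a[i + 2]);
        zeros -= ones16_u(a[i + 3]);
    }
    for (; i < len; i++)       /* tail: fewer than four integers left */
        zeros -= ones16_u(a[i]);
    return zeros;
}
```

When the array size is known to be a multiple of the unroll factor, the tail loop and its checks can be dropped entirely, which is the kind of check removal the bullets describe.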
Final performance
  • Part #2: 1002 clocks
    • Rank #1 (could run even faster by combining the others’ techniques)
  • Part #3: 674 clocks
    • Rank #1
Appreciation
  • Thanks for your attention.
  • Thanks to Professor hil for the slide template.