This guide walks through setting up and running a master-slave architecture for distributed word counting in Java. It covers logging in, running the word count locally over a set of abstracts, starting the directory, and configuring and starting both slave and master nodes. It also covers run configurations, small and large datasets, and heap-space limits.
Palabras Architecture
[Diagram: Master1 … MasterN; Slave1 … SlaveN; a Directory; Jobs1 … JobsM; Slave1 … SlaveM]
Step 1: Get Started
• Login:
  • Username: nombre\cc5212
  • Password on board
• http://aidanhogan.com/teaching/cc5212-1/mdp-lab2.zip
• C:/Program Files (x86)/eclipse/ (in Spanish)
• File > Import > …
• http://aidanhogan.com/teaching/cc5212-1/mdp-lab2-data/
Step 2: Run Locally
• ~600,000 abstracts
• ~52,340,000 non-unique words
• ~320 MB uncompressed
• org.mdp.cli.RunWordCountLocally
• Right Click > Run As > Run Configurations > Arguments
• -i <path>/abstracts-es.txt.gz -igz -k 500
How long will it take? Will it even run?
-Xmx256M (JVM maximum heap size)
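To make the local run concrete, here is a minimal sketch of an in-memory word count followed by a top-k selection over a gzipped text file. It is not the code of org.mdp.cli.RunWordCountLocally: the class name LocalWordCountSketch, its methods, and the whitespace tokenisation are assumptions made for illustration. It keeps one map entry per unique word, which is why the heap limit matters for the large file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

// Hypothetical illustration; not part of the lab's org.mdp.cli package.
final class LocalWordCountSketch {

    /** Counts lower-cased, whitespace-separated tokens across all lines. */
    static Map<String, Integer> countWords(Stream<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        lines.forEach(line -> {
            for (String token : line.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        });
        return counts;
    }

    /** Returns the k most frequent words; ties are broken alphabetically. */
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: LocalWordCountSketch <file.txt.gz> <k>");
            return;
        }
        // Stream the gzipped file line by line instead of loading it whole.
        try (InputStream in = new GZIPInputStream(Files.newInputStream(Paths.get(args[0])));
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int k = Integer.parseInt(args[1]);
            for (Map.Entry<String, Integer> e : topK(countWords(reader.lines()), k)) {
                System.out.println(e.getKey() + "\t" + e.getValue());
            }
        }
    }
}
```

Sorting every entry is the simplest way to get the top k; a bounded priority queue would use less time on very large vocabularies, at the cost of a little more code.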
Step 3: Start the Directory
• I start the directory!
• vm116.dcc.uchile.cl (172.17.69.190)
• Port 1985
Remind me to set heap-space
Step 4: Prepare Slave
org.mdp.cli.StartWordCountSlave
• Implement openDirectoryStub()
• Add the slave's name to the directory
• Review the other code
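As a rough idea of what opening a directory stub can look like, the sketch below uses the standard Java RMI registry API. It assumes the directory is exposed through an RMI registry on the given host and port, which the slides do not state outright. The class DirectoryLookupSketch, its method names, and the binding name passed on the command line are hypothetical; the lab's own openDirectoryStub() may differ.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

// Hypothetical illustration; assumes the directory is reachable via Java RMI.
final class DirectoryLookupSketch {

    /**
     * Returns a reference to the registry at host:port.
     * No connection is made until a method is called on the result.
     */
    static Registry openRegistry(String host, int port) throws RemoteException {
        return LocateRegistry.getRegistry(host, port);
    }

    /** Looks up the remote object bound under the given name and casts it. */
    static <T extends Remote> T lookupStub(Registry registry, String name, Class<T> type)
            throws RemoteException, NotBoundException {
        return type.cast(registry.lookup(name));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: DirectoryLookupSketch <host> <port> <bindingName>");
            return;
        }
        Registry registry = openRegistry(args[0], Integer.parseInt(args[1]));
        // The binding name is whatever the directory server registered itself under.
        Remote stub = lookupStub(registry, args[2], Remote.class);
        System.out.println("Found remote object: " + stub);
    }
}
```

With a stub in hand, the slave would call whichever remote method the directory interface offers for registering a name; that interface is defined in the lab code and is not reproduced here.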
Step 5: Run Slave
• Build the .jar using build.xml (dist)
• Open cmd and change to the directory containing the jar
• java -jar -Xmx256M mdp-2.jar StartWordCountSlave -dn vm116.dcc.uchile.cl -dp 1985 -sn <username>
Step 6: Prepare Master
org.mdp.cli.StartWordCountMaster
• Connect to the directory
• Get the list of slaves from the directory
• Clear any words stored for you on each slave
• Choose a slave for each word
• Send the add-words job to each slave
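One common way to choose a slave for each word is hash partitioning: the same word always maps to the same slave, so each slave holds the complete count for its words. The sketch below shows that idea only; the class WordPartitionSketch and its methods are hypothetical, and the lab's master may pick slaves differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of hash partitioning; not the lab's master code.
final class WordPartitionSketch {

    /** Maps a word to a slave index in [0, slaveCount), even for negative hash codes. */
    static int slaveIndexFor(String word, int slaveCount) {
        return Math.floorMod(word.hashCode(), slaveCount);
    }

    /** Groups words into one batch per slave name, ready to send as add-words jobs. */
    static Map<String, List<String>> batchBySlave(List<String> words, List<String> slaveNames) {
        Map<String, List<String>> batches = new HashMap<>();
        for (String name : slaveNames) {
            batches.put(name, new ArrayList<>());
        }
        for (String word : words) {
            String slave = slaveNames.get(slaveIndexFor(word, slaveNames.size()));
            batches.get(slave).add(word);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> slaves = Arrays.asList("slave1", "slave2", "slave3");
        List<String> words = Arrays.asList("la", "casa", "la", "mesa");
        // Every occurrence of a word lands in the same slave's batch.
        batchBySlave(words, slaves).forEach(
                (slave, batch) -> System.out.println(slave + " <- " + batch));
    }
}
```

Batching words per slave before sending keeps the number of remote calls small, which usually matters more than the partitioning function itself.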
Step 7: Run Master
• For small dataset!
• org.mdp.cli.StartWordCountMaster
• Right Click > Run As > Run Configurations > Arguments
• -i <path>\es-abstracts-10k.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500
Step 8: Run Big Master
• For big dataset!
• org.mdp.cli.StartWordCountMaster
• Right Click > Run As > Run Configurations > Arguments
• -i <path>\es-abstracts.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500
Step 9: Run Distribution Locally
• Start a directory server
  • Build and use the jar
  • java -jar mdp-2.jar StartRegistryAndServer -n localhost -p 1985 -r -s 1 -sp
• Start 4 slaves (give them different names) in four different CMD windows
  • Use the jar
  • java -jar mdp-2.jar StartSlave -dn localhost -dp 1985 -wn <usernameN>
• Start a master
  • Can use Eclipse or the jar (as preferred)
  • Point it to the local directory
  • Use the small file (then the large file if successful)
-Xmx256M (JVM maximum heap size)
Final Step: Teach Me Spanish
Ask me words in the top 500!